In [216]:
# Import all of the things you need to import!
import pandas as pd
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.feature_extraction.text import TfidfVectorizer
import re
from nltk.stem.porter import PorterStemmer
from sklearn.feature_extraction.text import CountVectorizer

pd.options.display.max_columns = 30
%matplotlib inline

Homework 14 (or so): TF-IDF text analysis and clustering

Hooray, we kind of figured out how text analysis works! Some of it is still magic, but at least the TF and IDF parts make a little sense. Kind of. Somewhat.

No, just kidding, we're professionals now.

Investigating the Congressional Record

The Congressional Record is more or less what happened in Congress every single day. Speeches and all that. A good large source of text data, maybe?

Let's pretend it's totally secret but we just got it leaked to us in a data dump, and we need to check it out. It was leaked from this page here.


In [217]:
# If you'd like to download it through the command line...
!curl -O http://www.cs.cornell.edu/home/llee/data/convote/convote_v1.1.tar.gz


  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 9607k  100 9607k    0     0   598k      0  0:00:16  0:00:16 --:--:--  548k

In [218]:
# And then extract it through the command line...
!tar -zxf convote_v1.1.tar.gz

You can explore the files if you'd like, but we're going to get the ones from convote_v1.1/data_stage_one/development_set/. It's a bunch of text files.


In [219]:
# glob finds files matching a certain filename pattern
import glob

# Give me all the text files
paths = glob.glob('convote_v1.1/data_stage_one/development_set/*')
paths[:5]


Out[219]:
['convote_v1.1/data_stage_one/development_set/052_400011_0327014_DON.txt',
 'convote_v1.1/data_stage_one/development_set/052_400011_0327025_DON.txt',
 'convote_v1.1/data_stage_one/development_set/052_400011_0327044_DON.txt',
 'convote_v1.1/data_stage_one/development_set/052_400011_0327046_DON.txt',
 'convote_v1.1/data_stage_one/development_set/052_400011_1479036_DON.txt']

In [220]:
len(paths)


Out[220]:
702

So great, we have 702 of them. Now let's import them.


In [221]:
speeches = []
for path in paths:
    with open(path) as speech_file:
        speech = {
            'pathname': path,
            'filename': path.split('/')[-1],
            'content': speech_file.read()
        }
    speeches.append(speech)
speeches_df = pd.DataFrame(speeches)
speeches_df.head()


Out[221]:
content filename pathname
0 mr. chairman , i thank the gentlewoman for yie... 052_400011_0327014_DON.txt convote_v1.1/data_stage_one/development_set/05...
1 mr. chairman , i want to thank my good friend ... 052_400011_0327025_DON.txt convote_v1.1/data_stage_one/development_set/05...
2 mr. chairman , i rise to make two fundamental ... 052_400011_0327044_DON.txt convote_v1.1/data_stage_one/development_set/05...
3 mr. chairman , reclaiming my time , let me mak... 052_400011_0327046_DON.txt convote_v1.1/data_stage_one/development_set/05...
4 mr. chairman , i thank my distinguished collea... 052_400011_1479036_DON.txt convote_v1.1/data_stage_one/development_set/05...

In class we had the texts variable. For the homework can just do speeches_df['content'] to get the same sort of list of stuff.

Take a look at the contents of the first 5 speeches


In [222]:
All_speeches = speeches_df['content']
First_five_speeches = speeches_df['content'].head(5)
First_five_speeches


Out[222]:
0    mr. chairman , i thank the gentlewoman for yie...
1    mr. chairman , i want to thank my good friend ...
2    mr. chairman , i rise to make two fundamental ...
3    mr. chairman , reclaiming my time , let me mak...
4    mr. chairman , i thank my distinguished collea...
Name: content, dtype: object

Doing our analysis

Use the sklearn package and a plain boring CountVectorizer to get a list of all of the tokens used in the speeches. If it won't list them all, that's ok! Make a dataframe with those terms as columns.

Be sure to include English-language stopwords


In [223]:
count_vectorizer = CountVectorizer(stop_words='english')

In [224]:
speech_tokens = count_vectorizer.fit_transform(All_speeches)

In [225]:
count_vectorizer.get_feature_names()


Out[225]:
['000',
 '00007',
 '018',
 '050',
 '092',
 '10',
 '100',
 '106',
 '107',
 '108',
 '108th',
 '109th',
 '10th',
 '11',
 '110',
 '114',
 '117',
 '118',
 '11th',
 '12',
 '120',
 '121',
 '122',
 '123',
 '125',
 '128',
 '12898',
 '13',
 '13279',
 '1332',
 '1335',
 '1344',
 '135',
 '138',
 '14',
 '140',
 '143',
 '144',
 '145',
 '149',
 '1498',
 '14th',
 '15',
 '150',
 '1520',
 '153',
 '155',
 '159',
 '16',
 '160',
 '162',
 '163',
 '165',
 '1671',
 '1675',
 '17',
 '170',
 '1700',
 '174',
 '178',
 '1787',
 '17th',
 '18',
 '180',
 '1800',
 '1800s',
 '181',
 '1812',
 '1855',
 '186',
 '1868',
 '18th',
 '19',
 '190',
 '1907',
 '1922',
 '1927',
 '1930',
 '1940s',
 '1950s',
 '196',
 '1960',
 '1960s',
 '1964',
 '1965',
 '1967',
 '1970s',
 '1971',
 '1972',
 '1973',
 '1974',
 '1976',
 '1979',
 '198',
 '1980s',
 '1981',
 '1982',
 '1983',
 '1984',
 '1985',
 '1986',
 '1987',
 '1988',
 '1989',
 '1990',
 '1990s',
 '1991',
 '1992',
 '1993',
 '1994',
 '1995',
 '1996',
 '1997',
 '1998',
 '1999',
 '19th',
 '1st',
 '20',
 '200',
 '2000',
 '2001',
 '2002',
 '2003',
 '2004',
 '2005',
 '2006',
 '2007',
 '2008',
 '2011',
 '2016',
 '202',
 '2072',
 '20th',
 '21',
 '2123',
 '2132',
 '214',
 '216',
 '21st',
 '22',
 '220',
 '2210',
 '2217',
 '222',
 '223',
 '225',
 '226',
 '229',
 '23',
 '231',
 '2324',
 '234',
 '2361',
 '23rd',
 '24',
 '240',
 '241',
 '2411',
 '242',
 '2451',
 '248',
 '25',
 '250',
 '2586',
 '26',
 '261',
 '263',
 '2646',
 '26th',
 '27',
 '270',
 '273',
 '275',
 '278',
 '279',
 '28',
 '283',
 '2844',
 '286',
 '287',
 '2882',
 '2884',
 '2888',
 '29',
 '2904',
 '2926',
 '293',
 '2934',
 '2944',
 '297',
 '2975',
 '2985',
 '2d',
 '2nd',
 '30',
 '300',
 '3000',
 '3004',
 '3005',
 '3006',
 '301',
 '302',
 '303',
 '304',
 '305',
 '306',
 '3061',
 '309',
 '3090',
 '30s',
 '31',
 '310',
 '311',
 '3130',
 '3160',
 '3162',
 '317',
 '32',
 '3238',
 '327',
 '3283',
 '329',
 '33',
 '3306',
 '332',
 '336',
 '34',
 '340',
 '345',
 '35',
 '350',
 '352',
 '353',
 '36',
 '365',
 '37',
 '37th',
 '38',
 '383',
 '387',
 '388',
 '39',
 '397',
 '40',
 '400',
 '40th',
 '41',
 '413',
 '42',
 '420',
 '421',
 '427',
 '43',
 '435',
 '439',
 '44',
 '440',
 '442',
 '45',
 '450',
 '454',
 '455',
 '457',
 '4571',
 '461',
 '465',
 '469',
 '47',
 '479',
 '48',
 '482',
 '483',
 '487',
 '488',
 '49',
 '492',
 '4th',
 '50',
 '500',
 '501',
 '502',
 '5064',
 '508',
 '51',
 '5135',
 '52',
 '521',
 '525',
 '526',
 '53',
 '5304',
 '5305',
 '5306',
 '533',
 '53857',
 '539',
 '54',
 '543',
 '544',
 '55',
 '554',
 '562',
 '564',
 '57',
 '574',
 '58',
 '587',
 '589',
 '59',
 '5th',
 '60',
 '600',
 '604',
 '605',
 '6070',
 '609',
 '612',
 '62',
 '63',
 '6370',
 '639',
 '64',
 '641',
 '65',
 '650',
 '653',
 '66',
 '67',
 '670',
 '672',
 '675',
 '68',
 '69',
 '692',
 '698',
 '70',
 '700',
 '701',
 '702',
 '719',
 '72',
 '724',
 '74',
 '743',
 '75',
 '750',
 '751',
 '754',
 '778',
 '79',
 '80',
 '800',
 '82',
 '822',
 '83',
 '830',
 '831',
 '84',
 '8400',
 '841',
 '845',
 '8494',
 '85',
 '850',
 '865',
 '868',
 '87',
 '870',
 '90',
 '900',
 '91',
 '912',
 '924',
 '92nd',
 '93',
 '94',
 '9500',
 '96',
 '97',
 '970',
 '975',
 '97th',
 '98',
 '9849',
 '99',
 '994',
 '9th',
 '__',
 'aaron',
 'aba',
 'abandon',
 'abandoned',
 'abandoning',
 'abcs',
 'abet',
 'abhorrent',
 'abide',
 'abides',
 'abiding',
 'abilities',
 'ability',
 'able',
 'ably',
 'abolish',
 'abraham',
 'abridgement',
 'abroad',
 'abrogation',
 'absence',
 'absent',
 'absentee',
 'absolutely',
 'absolve',
 'absorb',
 'absurd',
 'abundance',
 'abundant',
 'abuse',
 'abused',
 'abuses',
 'abusing',
 'abusive',
 'abysmal',
 'academic',
 'academically',
 'academics',
 'academy',
 'accede',
 'accelerated',
 'accept',
 'acceptable',
 'acceptance',
 'accepted',
 'accepting',
 'accepts',
 'access',
 'accessible',
 'accessing',
 'accession',
 'accessioning',
 'accessories',
 'accident',
 'accidents',
 'acclaimed',
 'accommodate',
 'accommodated',
 'accommodating',
 'accompanies',
 'accompanying',
 'accomplish',
 'accomplished',
 'accomplishes',
 'accomplishment',
 'accordance',
 'according',
 'accordingly',
 'account',
 'accountability',
 'accountable',
 'accountant',
 'accounting',
 'accounts',
 'accumulated',
 'accumulation',
 'accurate',
 'accurately',
 'accusations',
 'accused',
 'accustom',
 'achieve',
 'achieved',
 'achievement',
 'achievements',
 'achieving',
 'acknowledge',
 'acknowledged',
 'acknowledges',
 'aclu',
 'acquainted',
 'acquire',
 'acquired',
 'acquisition',
 'acquisitions',
 'acre',
 'acres',
 'acronym',
 'act',
 'acted',
 'acting',
 'action',
 'actionable',
 'actions',
 'activate',
 'active',
 'actively',
 'activities',
 'activity',
 'actor',
 'actors',
 'acts',
 'actual',
 'actually',
 'ada',
 'adamantly',
 'adams',
 'adc',
 'add',
 'added',
 'addiction',
 'adding',
 'addition',
 'additional',
 'additionally',
 'additions',
 'address',
 'addressed',
 'addresses',
 'addressing',
 'adds',
 'adequate',
 'adequately',
 'adhere',
 'adherents',
 'adhering',
 'adjacent',
 'adjourn',
 'adjournment',
 'adjudicated',
 'adjust',
 'adjusted',
 'adjustment',
 'adjustments',
 'administer',
 'administered',
 'administering',
 'administration',
 'administrations',
 'administrative',
 'administrator',
 'administrators',
 'admirable',
 'admire',
 'admission',
 'admit',
 'admitted',
 'admittedly',
 'admitting',
 'adolescence',
 'adopt',
 'adopted',
 'adopting',
 'adoption',
 'adoptions',
 'ads',
 'adult',
 'adults',
 'advance',
 'advanced',
 'advancement',
 'advancements',
 'advances',
 'advancing',
 'advantage',
 'advantaged',
 'advantages',
 'adventure',
 'adversary',
 'adverse',
 'adversely',
 'advertised',
 'advice',
 'advise',
 'advised',
 'advisor',
 'advisories',
 'advisory',
 'advocacy',
 'advocate',
 'advocated',
 'advocates',
 'aesthetic',
 'affairs',
 'affect',
 'affected',
 'affecting',
 'affects',
 'affiliated',
 'affiliation',
 'affirm',
 'affirmative',
 'affirmatively',
 'affirmed',
 'affirms',
 'affluent',
 'afford',
 'affordable',
 'afforded',
 'affording',
 'affront',
 'afghanistan',
 'afl',
 'aforementioned',
 'afraid',
 'africa',
 'african',
 'afscme',
 'aftermarket',
 'aftermath',
 'afternoon',
 'age',
 'aged',
 'agencies',
 'agency',
 'agenda',
 'agendas',
 'agents',
 'ages',
 'aggressively',
 'aggrieved',
 'ago',
 'agony',
 'agree',
 'agreed',
 'agreeing',
 'agreement',
 'agreements',
 'agrees',
 'agricultural',
 'agriculture',
 'aha',
 'ahead',
 'ahs',
 'aid',
 'aide',
 'aided',
 'aiding',
 'aim',
 'aimed',
 'aims',
 'air',
 'airing',
 'airline',
 'airplanes',
 'aisle',
 'ak',
 'akin',
 'akron',
 'al',
 'alabama',
 'alan',
 'alarm',
 'alarming',
 'alaska',
 'alaskan',
 'albany',
 'alcee',
 'aldebron',
 'alerted',
 'alexander',
 'alexandria',
 'alfred',
 'alice',
 'aliens',
 'align',
 'aligned',
 'aligns',
 'alike',
 'alive',
 'allegations',
 'allege',
 'alleged',
 'allegedly',
 'allegiance',
 'alleging',
 'alleviate',
 'alliance',
 'allied',
 'allocate',
 'allocation',
 'allocations',
 'allotment',
 'allotted',
 'allow',
 'allowable',
 'allowed',
 'allowing',
 'allows',
 'alluded',
 'almonds',
 'alphabet',
 'altamonte',
 'alter',
 'altered',
 'alternate',
 'alternates',
 'alternative',
 'alternatives',
 'alto',
 'amaze',
 'amazing',
 'ambassador',
 'ambulances',
 'ameliorate',
 'amend',
 'amendable',
 'amended',
 'amending',
 'amendment',
 'amendments',
 'america',
 'american',
 'americans',
 'amos',
 'amounting',
 'amounts',
 'amp',
 'ample',
 'amt',
 'analysis',
 'analyst',
 'analyze',
 'anathema',
 'anderson',
 'andrea',
 'andrews',
 'anecdotes',
 'angela',
 'angeles',
 'angry',
 'angst',
 'anguish',
 'animal',
 'animals',
 'animated',
 'ann',
 'anna',
 'annie',
 'annihilation',
 'anniston',
 'announce',
 'announced',
 'announcement',
 'annual',
 'annually',
 'anonymous',
 'ansje',
 'answer',
 'answered',
 'answers',
 'antagonizing',
 'antelope',
 'anthony',
 'anthrax',
 'anti',
 'anticipate',
 'anticipated',
 'anticipates',
 'antidumping',
 'antietam',
 'antiforum',
 'antimiscegenation',
 'antipathy',
 'antiquated',
 'antonio',
 'anxiety',
 'anxious',
 'anybody',
 'anymore',
 'anyplace',
 'anytime',
 'aoc',
 'apa',
 'apart',
 'apathy',
 'apologize',
 'apostle',
 'apparel',
 'apparent',
 'apparently',
 'appeal',
 'appealed',
 'appeals',
 'appear',
 'appeared',
 'appears',
 'appendices',
 'applaud',
 'appliance',
 'applicability',
 'applicable',
 'applicants',
 'application',
 'applied',
 'applies',
 'apply',
 'applying',
 'appoint',
 'appointed',
 'appointee',
 'appointing',
 'appointment',
 'appointments',
 'appreciably',
 'appreciate',
 'appreciated',
 'appreciates',
 'appreciation',
 'appreciative',
 'approach',
 'approached',
 'approaches',
 'appropriate',
 'appropriated',
 'appropriately',
 'appropriates',
 'appropriation',
 'appropriations',
 'appropriators',
 'approval',
 'approve',
 'approved',
 'approving',
 'approximately',
 'april',
 'aptly',
 'aquatic',
 'ar',
 'arab',
 'arabia',
 'arbitrary',
 'arc',
 'architect',
 'architects',
 'architectural',
 'architecture',
 'ardently',
 'ardmore',
 'area',
 'areas',
 'arena',
 'argentina',
 'argue',
 'argued',
 'arguing',
 'argument',
 'argumentative',
 'arguments',
 'arid',
 'arise',
 'arisen',
 'arising',
 'aristocracies',
 'aristocracy',
 'arizona',
 'arkansas',
 'arm',
 'armed',
 'armies',
 'armor',
 'arms',
 'army',
 'arnold',
 'arnolds',
 'arrange',
 'arrangements',
 'array',
 'arrays',
 'arrested',
 'arrests',
 'arrival',
 'arrived',
 'arrogance',
 'arrogant',
 'arsenal',
 'art',
 'article',
 'articles',
 'articulate',
 'artificial',
 'artificially',
 'arts',
 'ascertain',
 'asfe',
 'asian',
 'aside',
 'asides',
 'ask',
 'asked',
 'asking',
 'asks',
 'aspect',
 'aspects',
 'asphyxiating',
 'assault',
 'assemble',
 'assembly',
 'assert',
 'asserted',
 'assertion',
 'assertions',
 'assess',
 'assessed',
 'assessing',
 'assessment',
 'assessments',
 'assets',
 'assigned',
 'assignment',
 'assigns',
 'assimilating',
 'assist',
 'assistance',
 'assistant',
 'assisted',
 'assisting',
 'assists',
 'assoc',
 'associate',
 'associated',
 'associates',
 'association',
 'associations',
 'assume',
 'assumed',
 'assumes',
 'assuming',
 'assumption',
 'assumptions',
 'assurance',
 'assurances',
 'assure',
 'assured',
 'assures',
 'assuring',
 'asthma',
 'astounding',
 'astronaut',
 'astronomical',
 'athletic',
 'atkins',
 'atla',
 'atlanta',
 'atm',
 'atmosphere',
 'attach',
 'attached',
 'attaching',
 'attack',
 'attacked',
 'attacks',
 'attain',
 'attainable',
 'attaining',
 'attempt',
 'attempted',
 'attempting',
 'attempts',
 'attend',
 'attended',
 'attending',
 'attention',
 'attest',
 'attitude',
 'attorney',
 'attorneys',
 'attract',
 'audit',
 'audited',
 'auditing',
 'audits',
 'august',
 'austin',
 'australia',
 'authentic',
 'author',
 'authoring',
 'authorities',
 'authority',
 'authorization',
 'authorizations',
 'authorize',
 'authorized',
 'authorizes',
 'authorizing',
 'authors',
 'auto',
 'autocratic',
 'automatic',
 'automatically',
 'automobile',
 'automotive',
 'autonomy',
 'avail',
 'availability',
 'available',
 'avalanche',
 'avenue',
 'average',
 'aviation',
 'avoid',
 ...]

In [226]:
All_tokens = pd.DataFrame(speech_tokens.toarray(), columns=count_vectorizer.get_feature_names())

In [227]:
#All_tokens

Okay, it's far too big to even look at. Let's try to get a list of features from a new CountVectorizer that only takes the top 100 words.


In [228]:
count_vectorizer_100 = CountVectorizer(max_features=100, stop_words='english')

In [229]:
speech_tokens_top100 = count_vectorizer_100.fit_transform(speeches_df['content'])

Now let's push all of that into a dataframe with nicely named columns.


In [230]:
Top_100_tokens = pd.DataFrame(speech_tokens_top100.toarray(), columns=count_vectorizer_100.get_feature_names())
Top_100_tokens.head()


Out[230]:
000 11 act allow amendment america american amp association balance based believe bipartisan chairman children ... teachers thank think time today trade united urge vote want way work year years yield
0 0 1 3 0 0 0 3 0 0 0 0 1 0 3 0 ... 0 1 3 3 2 0 1 0 0 1 1 0 0 0 1
1 0 0 1 1 1 0 0 0 0 1 0 0 0 2 0 ... 0 1 0 2 2 0 0 0 1 1 3 0 1 0 0
2 0 0 0 0 0 0 1 0 0 0 0 0 0 2 0 ... 0 0 0 0 0 0 0 0 1 0 0 0 0 0 1
3 0 0 0 0 0 1 0 0 0 1 0 0 0 2 0 ... 0 0 0 2 0 0 1 0 1 1 1 0 0 0 0
4 0 0 0 0 1 0 0 0 0 0 0 0 0 1 0 ... 0 1 0 1 0 0 0 0 2 0 0 0 0 0 2

5 rows × 100 columns

Everyone seems to start their speeches with "mr chairman" - how many speeches are there total, and many don't mention "chairman" and how many mention neither "mr" nor "chairman"?


In [231]:
speeches_df.info()


<class 'pandas.core.frame.DataFrame'>
RangeIndex: 702 entries, 0 to 701
Data columns (total 3 columns):
content     702 non-null object
filename    702 non-null object
pathname    702 non-null object
dtypes: object(3)
memory usage: 16.5+ KB

In [237]:
Top_100_tokens['No_chairman'] = Top_100_tokens['chairman'] == 0
Top_100_tokens[Top_100_tokens['No_chairman'] == True].count().head(1)


Out[237]:
000    250
dtype: int64

In [238]:
Top_100_tokens['no_mr'] = Top_100_tokens['mr'] == 0
Top_100_tokens[Top_100_tokens['no_mr'] == True].count().head(1)


Out[238]:
000    79
dtype: int64

What is the index of the speech thank is the most thankful, a.k.a. includes the word 'thank' the most times?


In [239]:
Top_100_tokens['thank'].sort_values(ascending=False).head(1)


Out[239]:
577    9
Name: thank, dtype: int64

If I'm searching for China and trade, what are the top 3 speeches to read according to the CountVectoriser?


In [240]:
Top_100_tokens['china trade'] = Top_100_tokens['china'] + Top_100_tokens['trade']

In [241]:
Top_100_tokens['china trade'].sort_values(ascending=False).head(3)


Out[241]:
379    92
399    36
345    27
Name: china trade, dtype: int64

Now what if I'm using a TfidfVectorizer?


In [247]:
idf_vectorizer = TfidfVectorizer(stop_words='english', use_idf=True)
Top_100_tokens_idf = idf_vectorizer.fit_transform(All_speeches)
idf_df = pd.DataFrame(Top_100_tokens_idf.toarray(), columns=idf_vectorizer.get_feature_names())
idf_df['china trade'] = idf_df['china'] + idf_df['trade']

In [248]:
idf_df['china trade'].sort_values(ascending=False).head(3)


Out[248]:
402    0.909362
345    0.863658
317    0.857680
Name: china trade, dtype: float64

What's the content of the speeches? Here's a way to get them:


In [251]:
# index 0 is the first speech, which was the first one imported.
paths[402]


Out[251]:
'convote_v1.1/data_stage_one/development_set/421_400387_2010045_DMN.txt'

In [253]:
# Pass that into 'cat' using { } which lets you put variables in shell commands
# that way you can pass the path to cat
!cat {paths[577]}


mr. chairman , i just wanted to remind the house that faith-based organizations can and do sponsor federally funded head start programs . 
any sponsor who will agree not to discriminate in employment , if they can sponsor a program with the discrimination amendment , they can sponsor the program without that amendment if they would agree not to discriminate . 
what we are talking about is discrimination . 
some people want to discriminate against catholics , jews , muslims , african americans . 
we had this discussion in the 1960s , and the consensus back then was that discrimination in employment was so offensive that we made it illegal . 
the victim needs to be protected and the weight of the federal government will fall down on the side of the victim . 
the vote was not unanimous . 
some people did not like it then ; they do not like it now . 
and we are discussing where should the weight of the government be , with the victim or with somebody trying to discriminate . 
this is head start . 
we should not give students of head start the idea that their parents were denied a federally funded job solely because of their religion . 
we have heard of the supreme court . 
all of the supreme court decisions have said it is okay for a church to discriminate in employment with church money . 
none have supported discrimination with direct federal funding . 
we have heard of our forefathers in 1964 . 
we know that since 1965 it has been illegal , at least until this administration , to discriminate with federal money . 
head start has been reauthorized for over 40 years with the civil rights protections . 
president clinton 's name has been invoked . 
what is left out is his signing statement where he said that his analysis was that they could not discriminate with the federal money under his analysis . 
this administration has changed that analysis , but we need to make sure that president clinton 's whole signing statement is included . 
mr. chairman , i submit for printing in the record letters from numerous organizations including the national head start association which oppose the discrimination amendment and ask us to vote `` no '' on the underlying bill if they sabotage civil rights protections . 
september 22 , 2005 . 
dear member of congress : i have become aware that an amendment has been offered by rep . 
boustany ( r-la ) to the head start bill on the house floor today that would give faith-based organizations providing head start services the right to discriminate with federal funds against employees who are of different faiths . 
as the state president of the louisiana head start association , i strongly oppose such an amendment . 
it is a sad day when members of congress try to manipulate compassion evoked by the national tragedy in my state of louisiana caused by katrina to pass a civil rights repeal in head start or jeopardize the passage of this law so important to the children of my state and our nation . 
i know , firsthand , that head start is a model for demonstrating that a strong prohibition on religious employment discrimination with federal funds is fully compatible with federal assistance to faith-based charities . 
faith-based organizations , like the ones i oversee , can and do fully participate in federally funded programs without discriminating in hiring with those same federal funds . 
i see no reason to change the law to allow them to use federal funds to discriminate against our employees . 
my state 's religiously affiliated providers are more than capable and willing to honor the civil rights requirements of the head start program . 
i am greatly concerned that the provision to remove civil rights protections for employees could have a negative impact on the children and families who participate in these programs . 
tens of thousands of at-risk 3- and 4-year-old children currently in head start could lose their teachers -- who often are the most important adults to whom they have bonded , other than their parents -- not because those teachers are doing a bad job , but because they are the `` wrong '' religion . 
as the state president of the louisiana head start association , i urge you to reject the boustany amendment to allow discrimination in head start . 
such a provision is incompatible with the mission of this program . 
sincerely , & lt ; center & gt ; barbara pickney , & lt ; /center & gt ; & lt ; center & gt ; & lt ; em & gt ; st . 
landry parish head start program , state president of the louisiana head start association . 
& lt ; /em & gt ; national head start association , alexandria , va , september 19 , 2005 . 
dear chairman boehner and ranking member miller : on behalf of the more than 2.5 million children and families , program staff and volunteers that comprise the head start and early head start community , we are writing to you today to address certain issues regarding the reauthorization of the head start act . 
we appreciate the bi-partisan spirit that has occurred throughout this crafting of the reauthorization bill . 
h.r. 2123 does not contain the controversial block grant proposal of the 108th congress and maintains the crucial comprehensive services of the head start program performance standards . 
we applaud a number of measures and improvements incorporated into this bill , such as enhanced homeless outreach ; greater set asides for migrant and seasonal workers and native americans , as well as early head start programs ; and the addition of a `` seamless service '' provision that allows programs to convert head start slots to early head start slots under certain circumstances . 
while the recompetition provision is not perfect , we appreciate that its intent is not to recompete all programs , but to recompete only failing programs . 
we also acknowledge that the teacher requirements are based on national goals and that training and technical assistance is funded at two percent , with 50 percent of that amount going directly to programs . 
while we generally are pleased with the overall intent and direction of h.r. 2123 , we do have continuing concerns about certain specific provisions that we hope that can be resolved before the bill is enacted into law . 
these concerns are discussed in greater detail below . 
recompetition procedures , which are laid out in detail in section 641 ( c ) ( 1 ) - ( 19 ) include several areas that are problematic . 
while we strongly agree that programs that are not providing high quality services should have to recompete for head start funds , we are concerned that the language in this section may force more programs -- regardless of quality -- to undergo recompetition . 
we believe that there should be a strong message that all programs must be high performing . 
yet , we also believe that programs that are providing high quality services should not be put in the position of recompeting every five years , as this instability makes it difficult for them to recruit and retain the best teachers , to invest in facilities , and to create lasting partnerships with other community agencies . 
while we appreciate the efforts to make the recompetition process fair , there remains a very long list of tests that must be met to determine the priority status of programs . 
we continue to have concerns that some of these tests could be evaluated in an arbitrary manner , throwing programs into a recompete status , regardless of their performance . 
the head start community does not want to see failing programs continue , but we would like reassurances that the recompetition process will be unbiased and consistent in its application by the bureau . 
to achieve this , we would prefer that there be more limited parameters to determine the need to recompete a grantee , such as programs that have unresolved areas of noncompliance . 
the entire head start community is committed to raising the bar when it comes to improving quality and enhancing teacher and staff credentials . 
additionally , educational levels among head start teachers have increased appreciably since the 1998 congressional mandate to increase the proportion of head start teachers with an a.a . 
degree . 
fifty-seven percent of head start teachers had at least an a.a . 
degree in 2003 , exceeding a congressional mandate that 50 percent of head start teachers in center-based classrooms attain an a.a . 
degree or higher by september 2003 . 
most head start teachers without degrees were working toward them . 
fifty-eight percent of head start teachers without a degree or credential were enrolled in an early childhood education or related degree program , and 18 percent were in child development associate ( cda ) or equivalent training . 
a key to head start 's success in meeting the 1998 mandate was that congress also increased funding , which provided scholarships , release time and qualified substitutes , teacher salary increases , and other quality enhancement supports . 
the 1998 law required that , when funding for the program increased , a certain percentage of new dollars would be dedicated to quality . 
in the following years , funding for the head start program grew and , as a result , funds available for quality activities increased . 
however , head start funding has not kept pace with inflation in recent years , so programs no longer have a growing source of funds to help teachers attain degrees . 
additional funding will be needed to meet a programs must have the resources to help teachers gain their credentials and to pay salaries at a high enough level to recruit and retain teachers with the required degree . 
without new money for teacher salaries , increased credentialing for teachers should not be mandatory . 
while we appreciate the modifications made in committee markup to the provisions regarding the head start parent policy councils , we strongly believe in the integral and shared responsibilities of board members and parents in head start governing bodies . 
the high degree of parental involvement in the head start program has provided a role model for early childhood education for 40 years . 
the head start community is fully committed to restoration of the current level of authority to parent policy councils . 
the nrs , a pre- and post-test for head start children , is not a valid measurement of program impact and should not be used in this manner . 
because head start serves children with very high level needs , using this kind of measure to evaluate programs may well penalize those programs serving the children with the greatest needs . 
further , as pointed out in a may 2005 general accountability office report , the nrs was found to be invalid and unreliable . 
the gao also confirmed that the nrs is not an appropriate evaluation vehicle for children who are english language learners , especially those who speak neither english nor spanish . 
additionally , we know that the head start bureau is spending more than $ 21 million annually on the nrs , an expenditure that does not even begin to take into consideration the costs of preparing for and administering the test at the program level . 
we ask the house of representatives to suspend further use of and expenditures for the nrs until the national academy of sciences can make the test scientifically valid . 
h.r. 2123 contains a provision that the head start community believes is punitive and unreasonable to all head start programs . 
the process and planning that is required of program administrators for a full prism review can not be performed overnight . 
the head start community has no objection to unannounced site visits when they concern health and safety issues or are following up on prior compliance matters . 
nhsa believes that a minimum of 30 days notice should be required of the head start bureau before full prism reviews . 
high quality training is critically important to improving and sustaining head start quality and childhood outcomes . 
h.r. 2123 limits the ability of parents and staff to travel in order to receive specialized training and career development at national conferences . 
this is an unnecessary provision that will cause confusion for program administrators since the existing grant application process requires justification of all training . 
while the head start community strives for sound collaboration with their respective state officials , it is critically important that state officials reciprocate in these collaborative efforts . 
h.r. 2123 does not require input as it should , and as is now required , from state head start officials in the process of selecting staff who will have coordination responsibilities . 
the head start community believes that state head start associations should have sign-off on candidates for state collaboration officers , as well as continuing involvement in the planning and implementation of state plans . 
furthermore , there should be clarification regarding states that have existing state advisory councils , namely that they are permitted to modify them to meet the requirements in the bill . 
the head start community , including a number of programs administered by religious organizations , strongly opposes any effort by this administration to encourage religious discrimination in hiring practices for head start or any federally-funded program . 
freedom of religion , a cornerstone of this great nation , should be sacrosanct to all of us . 
it is incomprehensible that the u.s. congress would tamper with the ability of its citizens to practice their faith by using the threat of employment discrimination . 
in spite of its positive provisions , if h.r. 2123 contains a religious discrimination amendment , we must reluctantly oppose the bill . 
in closing , we commend the education and workforce committee for their bi-partisan efforts in this head start reauthorization bill and we hope that modifications will be made that will result in improvements to the program . 
sincerely , ministers in action , washington , dc , september 16 , 2005 . 
dear member of congress : as pastors and leaders of predominately african american congregations across the country , we urge you to stand up for the civil rights and religious freedom of all americans , and to maintain the bipartisan direction of the school readiness act ( h.r. 2123 ) by opposing any attempt to repeal longstanding critical civil rights protections on the house floor . 
this bill maintains provisions designed to protect over 198 , 000 head start teachers and staff and over 1 , 450 , 000 parent volunteers from employment discrimination based on religion in federally-funded head start programs . 
we have continually supported these provisions because this is consistent with our commitment to protecting the as religious figures we provide leadership grounded by theological interpretations of scripture , and focus on issues of concern to our parishioners and our community . 
we agree that religious organizations participating in the head start program make an invaluable contribution to the education of thousands of students in minority communities in particular , but do not agree that discriminating against persons based upon their religion is necessary or desirable in order to provide these much needed services . 
we are optimistic that this bill can gain broad support among religious , civil rights , labor , education , health , and advocacy organizations , but this broad support will end if there is any threat to remove the longstanding critical civil rights protections in head start . 
in particular , we are seriously concerned about a statement released by the committee on education and the workforce on may 5 , 2005 , in which chairman boehner stated that he foresees an amendment on the house floor to rollback longstanding critical civil rights protections . 
in light of this statement , we are asking members to oppose this amendment and not support the head start bill if the anti-discrimination provisions are removed . 
as leaders of our respective congregations we are committed to providing much needed services in our communities and have done so by respecting the rights of all individuals . 
therefore , we find it particularly insulting to suggest that it is necessary to remove civil rights protections from head start programs in order for this outreach to continue . 
furthermore , we can not compromise our principles by supporting a program that allows organizations , including religiously-affiliated organizations , to discriminate with federal taxpayers ' dollars . 
we urge you to maintain the bipartisan direction of the school readiness act ( h.r. 2123 ) and to not support any agreement that allows for an assault on civil rights protections in federally-funded programs , especially a program as critical as head start . 
this could destroy the mutually supported nature of the head start program in which the education of young children -- especially minority children -- is so dependent upon parental participation and on ongoing , close relationships with head start teachers . 
uplifting our surrounding community does not require the concurrent advancement of government funded discrimination . 
sincerely , reverend timothy mcdonald , anti-defamation league , new york , ny , september 16 , 2005 . 
dear representative : on behalf of the anti-defamation league , we write to urge you to maintain the civil rights protections currently included in the house education and the workforce-approved version of the school readiness act ( h.r. 2123 ) -- and to oppose any efforts to repeal these important provisions . 
allowing religious-based employment discrimination in federally-funded programs is wrong -- and to do it on the historic head start anti-poverty education program is deeply offensive . 
since 1972 , agencies that receive government funding for head start -- including religious organizations and houses of worship that host head start programs -- have been prohibited from discriminating on the basis of religion when hiring or firing staff within the federally-funded program . 
these existing non-discrimination requirements have a history of bipartisan support , and were originally signed into law by president richard nixon . 
the current anti-discrimination language was included in the 1981 head start reauthorization bill , signed into law by president ronald reagan , and has been included in every head start reauthorization since then -- in 1984 , 1986 , 1990 , 1994 , and 1998 . 
for 33 years , these fundamental non-discrimination protections have worked well , allowing thousands of head start programs in communities throughout the country to flourish while maintaining constitutional and civil rights safeguards against religious tests for employment in federally-funded programs . 
we have great appreciation for the vital role religious institutions have historically played in addressing many of our nation 's most pressing social needs , as a critical complement to government-funded programs . 
for decades , government-funded partnerships with religiously-affiliated organizations -- such as catholic charities , jewish community federations , and lutheran social services -- have helped to combat poverty and provided housing , education , and health care services for those in need . 
these successful partnerships have provided excellent service to communities , largely unburdened by concerns over bureaucratic entanglements between government and religion . 
indeed , at the same time that safeguards have protected beneficiaries from unwanted and unconstitutional the house has never voted to repeal existing civil rights protections in a floor amendment . 
to do so on head start , an historic anti-poverty program universally acclaimed and present in so many communities across the country , is odious . 
we urge you to oppose any attempt to remove civil rights protections from head start . 
sincerely , & lt ; center & gt ; michael lieberman , & lt ; /center & gt ; & lt ; center & gt ; & lt ; em & gt ; washington counsel. & lt ; /em & gt ; & lt ; center & gt ; jess n. hordes , & lt ; /center & gt ; & lt ; center & gt ; & lt ; em & gt ; washington director. & lt ; /em & gt ; american federation of state , county and municipal employees , afl-cio , washington , dc , september 20 , 2005 . 
dear representative : on behalf of the 1.4 million members of the american federation of state , county and municipal employees ( afscme ) , i am writing with respect to certain provisions of h.r. 2123 which would reauthorize the head start program . 
we want to express our sincere appreciation for the bi-partisan and inclusive process that resulted in unanimous approval of the legislation at the committee level . 
significantly , h.r. 2123 does not include the controversial block grant proposal that derailed efforts to reauthorize head start in the last congress . 
rather , h.r. 2123 respects and maintains the crucial comprehensive services of the program performance standards that long have marked head start as a program of distinction . 
we believe that h.r. 2123 , with some changes , has the very real potential to build upon the success of head start for future generations . 
however , we are concerned that this bill does not address the low pay offered to head start teachers and staff and the lack of financial assistance in meeting new and more rigorous educational requirements . 
we support h.r. 2123 's focus on raising standards for head start teachers , including the provision calling for 50 percent of all current head start teachers to have a bachelor 's degree within five years and all new head start teachers to have an associate 's degree . 
however , the estimated cost of the additional education for half of all head start teachers to earn bachelor 's degrees by 2008 is approximately $ 2 billion over five years . 
if we want quality education for head start children , we must be willing to help teachers achieve this important goal . 
afscme members have worked in head start programs for decades . 
we know that the qualifications of early childhood educators matter because high quality early education improves outcomes for children and delivers benefits to the community that far outweigh the costs . 
we are also deeply concerned that chairman boehner intends to offer a controversial amendment on the floor to repeal longstanding civil rights protections from the head start program . 
allowing federally-funded discrimination in any program is immoral . 
but it is especially egregious given that the civil rights protections in head start are an integral part of its mission to provide families a ladder out of poverty by encouraging parents to become volunteers and then teachers . 
denying a parent economic opportunity because of the religion he/she practices violates the principles upon which our country was founded . 
we strongly urge you to oppose the amendment . 
if the amendment is adopted , afscme urges you to oppose the bill on final passage . 
sincerely , charles m. loveless , on civil rights , washington , dc , september 16 , 2005 . 
dear representative : on behalf of the leadership conference on civil rights ( lccr ) , the nation 's oldest , largest , and most diverse civil and human rights coalition , with more than 190 member organizations , we urge you to oppose the boehner amendment or any amendment to the school readiness act ( h.r. 2123 ) that would repeal longstanding civil rights protections in the head start program that have been in place since president nixon signed the law in 1972 . 
we strongly oppose any language that would allow federally-funded employment discrimination . 
if language repealing civil rights protections is added to the bill during consideration on the house floor , we urge you to oppose final passage of h.r. 2123 . 
lccr opposes allowing government-funded employment discrimination . 
religious organizations have always served as key partners in providing government services through the head start program and current law has not been a hindrance to their vigorous participation . 
there also is no controversy over the exemption under title vii of the civil rights act of 1964 that allows religious organizations to have a preference of hiring co-religionists when they are using private funds , but federal funds may not be used to discriminate . 
such a drastic change to the current head start program would be inconsistent with the long held notion that federal dollars must not be used to discriminate . 
the boehner amendment would allow government-funded employment discrimination , although the u.s. supreme court affirmed the title vii exemption for privately-funded religious employers , it did not authorize federally-funded employment discrimination . 
see corporation of presiding bishop of church of jesus christ of latter day saints v. amos , 483 u.s. 327 ( 1987 ) . 
we believe , based on analysis of amos , that if federal funds are used by religious organizations to hire only persons of their own faith , then the federal government is affirmatively acting to advance employment discrimination . 
in the 60 years since franklin d. roosevelt signed the first executive order prohibiting discrimination in federally funded activity , our nation has made significant progress in the struggle to end employment discrimination and advance equality . 
any attempt to allow organizations to discriminate on the basis of religion with federal funds would drastically impede that progress and erode a longstanding principle of our nation 's civil rights policy : that federal civil rights obligations follow federal dollars , regardless of who receives them . 
the courts have affirmed the principle that federal funds can not be used to discriminate . 
the leading case on the question of government-aided discrimination is norwood v. harrison , 413 u.s. 455 ( 1973 ) . 
in a unanimous decision , the u.s. supreme court held that `` the constitution does not permit the state to aid discrimination. '' id . 
465-66 . 
the principles set out in norwood were affirmed in justice o'connor 's opinion in city of richmond v. j.a . 
croson co . 
488 u.s. 469 , 492 ( 1989 ) , which stated , lccr urges you to oppose rep . 
boehner 's amendment because current law must not be changed to allow recipients of head start funds to have an explicit statutory right to engage in employment discrimination . 
if this amendment passes , or other language is added during floor consideration that repeals current law , lccr urges you to oppose final passage of h.r. 2123 . 
if you have any questions , please contact nancy zirkin , lccr deputy director , or andrea martin , senior counsel and policy analyst regarding this or any issue important to lccr . 
sincerely , & lt ; center & gt ; wade henderson , & lt ; /center & gt ; & lt ; center & gt ; & lt ; em & gt ; executive director. & lt ; /em & gt ; & lt ; center & gt ; nancy zirkin , & lt ; /center & gt ; & lt ; center & gt ; & lt ; em & gt ; deputy director. & lt ; /em & gt ; washington bureau , national association for the advancement of colored people , washington , dc , september 19 , 2005 . 
dear member : on behalf of the national association for the advancement of colored people ( naacp ) , our nation 's oldest , largest and most widely recognized grassroots civil rights organization , i am writing today to urge you to do all you can to ensure that the longstanding , critical civil rights protections that are contained in the current version of h.r. 2123 , the school readiness act , are retained during consideration by the full house of representatives . 
specifically , i urge you to reject and work against the anticipated boehner amendment , which will repeal existing , long-standing head start provisions that prohibit religious organizations and churches from discriminating on the basis of religion when hiring or firing staff from positions within this federally-funded program . 
h.r. 2132 , as approved by the committee on education and labor , maintains provisions designed to protect the more than 198 , 000 head start teachers , staff and over 1 , 450 , 000 parent volunteers from employment discrimination based on religion in federally-funded head start programs . 
the naacp again urges you to do all you can to maintain these vital protections throughout the legislative process , and that you do not support this legislation if , at any point they are stripped . 
the critical longstanding nondiscrimination provisions have been included in head start legislation since 1981 . 
this is a fundamental civil rights protection against employment discrimination for head start teachers and volunteers . 
the legislation has always received strong bipartisan support from both the house and senate since its enactment in the 97th congress when president ronald reagan signed the legislation into law . 
the twenty-four year old civil rights provision has worked well since the inception of this program , allowing religious organizations to participate in programs while maintaining constitutional and civil rights standards . 
the naacp both recognizes and celebrates that religious organizations participating in the head start program have made and continue to make an invaluable contribution to the education of thousands of students . 
these religious organizations have complied with head start 's existing civil rights requirements . 
however , if the repeal of the existing civil rights protections were to become law , teachers or parent volunteers working in any head start program run by a religious organization could immediately lose their jobs because of their religion . 
students participating in head start therefore could lose not only their teachers , but also the close programmatic connection with their own parents volunteering in the program . 
the naacp strongly believes that allowing discrimination based on religion thus , i urge you again , in the strongest terms possible , to support the continued inclusion of these longstanding and critical civil rights protections . 
the head start program is too critical to our children and our nation 's future to allow support for it to be divided by this issue . 
should you have any questions about the naacp position or if there is any way in which i can be of help to you as you move this reauthorization through the legislative process , i hope that you will feel free to contact me . 
thank you very much for you attention to the views of the naacp . 
sincerely , hilary o. shelton , the american jewish committee , washington , dc , september 19 , 2005 . 
dear representative : on behalf of the american jewish committee , the nation 's oldest human relations organization , with 33 chapters nationwide representing over 150 , 000 members and supporters , i urge you to oppose any amendments to the school readiness act , h.r. 2123 , that roll back crucial civil rights safeguards . 
further , if such an amendment is adopted , i urge you to oppose passage of h.r. 2123 ; repealing this longstanding essential element of head start could subject teachers in these federally-funded programs to religious discrimination . 
as passed out of the house education and the workforce committee , the bill maintains three-decade-old provisions that prohibit various forms of employment discrimination in head start . 
both religious and secular organizations have operated effectively under this system since it passed as part of bipartisan legislation passed during the 9th congress . 
ever since president richard nixon signed the legislation into law in 1972 , religion-based and other forms of discrimination are prohibited in head start programs , thereby ensuring that taxpayer dollars do not underwrite positions for which religion is a factor in hiring decisions . 
at the same time , the existing provisions do not intrude on the autonomy of religious organizations with respect to hiring decisions made in purely private programs . 
the efforts of the house education and the workforce committee to produce a bipartisan package are to be commended . 
the bill that reaches the house floor has the potential to receive broad support among religious , civil rights , labor , education , and health organizations . 
however , the bill risks losing critical segments of this support if , at any point , this initiative is amended to roll back head start 's longstanding civil rights protections by exempting religious organizations from the prohibition on religious discrimination in employment decisions . 
if so amended , h.r. 2123 would compromise an extremely successful program that provides essential services to nearly one million at-risk children nationwide . 
while many of the religious organizations that deliver the program would , no doubt , continue to hire employees for head start programs without regard to religion , h.r. 2123 could jeopardize the jobs of many thousands of current and potential teachers , staff , and parent volunteers for belonging to the `` wrong '' religion , as well as jeopardize children for whom a stable and trusting relationship between teacher and child is so important . 
for these reasons , we strongly urge you to oppose any attempts to roll back the vital civil rights protections of h.r. 2123 , the school readiness act . 
thank you for considering our views on this important matter . 
respectfully , richard t. foltin , of church and state , washington , dc , september 19 , 2005 . 
dear representative : americans united for separation of church and state urges you to oppose any amendment to repeal longstanding , critical civil rights protections contained in the school readiness act ( h.r. 2123 ) and to vote `` no '' on final passage of the bill if such an amendment is adopted . 
americans united represents more than 75 , 000 individual members throughout the fifty states , 9500 clergy nationwide , as well as cooperating houses of worship and other religious bodies committed to the preservation of religious liberty . 
h.r. 2123 unanimously passed out of the committee on education and the workforce on may 18 , 2005 , maintaining a longstanding civil rights provision designed to protect over 198 , 000 head start teachers and staff and over 1 , 450 , 000 parent volunteers from employment discrimination based on religion in federally-funded head start programs . 
we are pleased with this bipartisan legislation thus far , but are deeply concerned about stated threats to repeal longstanding civil rights protections against religious discrimination in our nation 's head start programs on the house floor . 
specifically , chairman boehner , after championing the committee-passed bill , stated that an amendment may be offered on the house floor that would repeal these protections . 
we urge you to reject attempts to sabotage a bipartisan effort to reauthorize the america 's head start programs with such a divisive anti-civil rights amendment . 
we recognize that religious organizations participating in the head start program make an invaluable contribution to the education of thousands of children . 
these organizations have complied with head start 's existing civil rights requirements without controversy . 
however , if the repeal of the existing civil rights protection were to become law , teachers or parent volunteers working in any head start program run by a religious organization could immediately lose their jobs simply because of their religion or religious beliefs . 
this would directly work against the stated goals of head start and could change the fundamental character of this tremendously successful program . 
according to the latest study from the national head start association , the program currently enjoys a soaring 96 percent parental satisfaction rate . 
the parents and communities that rely on head start programs should not have to choose between the renewal of the head start program and longstanding civil rights protections that are a cornerstone of this invaluable program . 
we hope that the house will continue the bipartisan goal of reauthorizing our nation 's head start programs and reject any attempts to roll back the civil rights protections long afforded to head start teachers and staff . 
if you have any questions about h.r. 2123 or would like further information on any other issue of importance to americans united , please contact aaron d. schuham , legislative director . 
sincerely , rev . 
barry w. lynn , for religious liberty , washington , dc , september 16 , 2005 . 
dear representative , the school readiness act of 2005 ( h.r. 2123 ) will soon be considered in the house . 
we write to urge you to oppose any effort to amend this bipartisan bill in a manner that would repeal current protections against religious discrimination . 
the current bill , passed out of committee with unanimous approval , maintains these important protections . 
unfortunately , repeated public statements have assured plans for a floor amendment that would allow religious discrimination in federally funded positions . 
we ask you to oppose any such amendment and to oppose final passage of the bill if the amendment were to pass . 
a recent hearing in the subcommittee on criminal justice , drug policy and human resources examining the faith-based initiative demonstrated once again that employment discrimination with federal dollars is one of the initiative 's most controversial and divisive elements . 
testimony indicated that the continued pursuit of such a rule change is often more about politics than good policy . 
head start should not be hijacked to promote such an unnecessary and unwise policy . 
religious organizations and the government have long worked in partnership to perform important social services . 
such partnerships are common for head start programs . 
we support these efforts and recognize the importance of government and religious cooperation generally . 
such cooperation has occurred for many years without the danger of government sponsored religious discrimination that is present in the proposed amendment . 
it would be extremely unwise to allow such a dramatic change in policy to threaten the reauthorization of head start . 
we appreciate your attention to this issue and urge you to oppose any proposal that would allow religious employment discrimination in government funded programs . 
sincerely , k . 
hollyn hollman , american civil liberties union , washington , dc , september 19 , 2005 . 
dear representative : the american civil liberties union strongly urges you to oppose any amendment to repeal longstanding critical civil rights protections contained in the school readiness act ( h.r. 2123 ) and vote `` no '' on final passage if such an amendment is adopted when the bill comes to the floor later this week . 
as unanimously passed out of the committee on education and the workforce , h.r. 2123 maintains longstanding provisions designed to protect over 198 , 000 head start teachers and staff and over 1 , 450 , 000 parent volunteers from employment discrimination based on religion in federally-funded positions in head start programs . 
the civil rights protections afforded to head start teachers and staff are essential and should not be repealed . 
proposed amendment to h.r. 2123 would repeal longstanding civil rights law that was never controversial we are pleased that the committee-passed head start legislation maintains longstanding critical civil rights protections . 
however , we are troubled by the threat of repealing these protections on the house floor . 
in a statement released by the committee on education and the workforce on may 5 , 2005 , the day h.r. 2123 was introduced , chairman boehner stated that he foresaw an amendment on the house floor to roll back longstanding critical civil rights protections . 
current law prohibits participants in head start programs from discriminating based on race , creed [ religion ] , color , national origin , sex , political affiliation or beliefs , or disability . 
42 u.s.c . 
9849 . 
if amended , h.r. 2123 would allow taxpayer dollars to fund religious organizations that discriminate against head start teachers and parent volunteers in federally-funded head start classrooms . 
the civil rights provision barring federally-funded religious discrimination has never been controversial . 
in fact , the provision was first included in head start legislation that was signed by president richard nixon and subsequently by president ronald reagan . 
throughout its 33-year history , the civil rights provision has not been an obstacle to the participation of religiously-affiliated organizations in head start programs . 
in fact , many religiously-affiliated organizations participate in head start and comply with the same civil rights provision that applies to everyone else . 
the proposed amendment to h.r. 2123 would reverse the government 's long fight against federally-funded discrimination repealing critical civil rights protections in head start attacks the very core of civil rights protections historically supported by the federal government . 
more than 60 years ago , the first success of the modern civil rights movement was a decision by president franklin roosevelt to bar federal contractors from discriminating based on race , religion , or national origin . 
from that first presidential decision through the supreme court 's decision allowing the federal government to deny special tax advantages to bob jones university , which claimed a religious right to retain the tax benefits while pursuing racist practices , the federal government has made the eradication of federally-funded discrimination among its highest priorities . 
if amended , h.r. 2123 would allow a religious organization , such as bob jones university , that discriminates based on religion , to participate in federal head start . 
in a disturbing result , bob jones university could be denied tax benefits because of its racist policies toward its students , but could receive federal head start money under h.r. 2123 to discriminate against teachers and parent volunteers working in head start classrooms -- simply because the employees do not meet bob jones university 's religious tests . 
moreover , in the many religious organizations in which the adherents are all of a single race , the result of federally-funded religious discrimination will effectively be federal funds going to the employment of persons of a single race . 
the federal government clearly has a compelling interest in applying the head start act 's civil rights provision to everyone receiving federal funds -- including religious organizations seeking to discriminate on the basis of religion in hiring persons to work in head start . 
repealing critical civil rights protections prohibiting discrimination in employment would be inconsistent with the leading supreme court case on the use of federal funds by religious organizations that discriminate . 
in bob jones univ . 
v. united states , 461 u.s. 574 ( 1983 ) , the supreme court held that federal government could deny a religiously-run university tax benefits because the university imposed a racially discriminatory antimiscegenation policy . 
id . 
at 605 . 
the court decided that the federal government 's compelling interest in eradicating racial discrimination in education superceded any burden on the university 's religious exercise of enforcing a religiously-motivated ban on students interracial dating . 
id . 
at 604 . 
there is no meaningful difference between the government prohibiting tax benefits to organizations that discriminate based on race and the head start act 's statutory prohibition on discrimination based on religion in head start classrooms . 
in fact , the united states itself -- during the current administration -- squarely rejected the proposition that intentional religious discrimination gets less protection under the equal protection clause than race . 
in its october 26 , 2001 brief defending the religion prong of title vii from an eleventh amendment attack , the united states stated that `` [ c ] ontrary to defendant 's contention that the supreme court has `distinguished claims involving differential treatment on the basis of race and speech from those involving religion , ' there can be no doubt that the equal protection clause subjects state governments engaging in intentional discrimination on the basis of religion to strict scrutiny. '' brief of intervenor united states in endres v. indiana state police ( n.d . 
ind . 
oct . 
26 , 2001 ) ( brief is available on www.usdoj.gov ) . 
if critical civil rights protections are repealed , h.r. 2123 would be unconstitutional h.r. 2123 , if amended , would abet unconstitutional employment discrimination based on religion . 
the proposed amendment 's exemption of religious organizations from the prohibition on religious discrimination in the program is contrary to constitutional law , and will open the door to government-funded discrimination . 
proponents of allowing religious organizations to use federal funds to discriminate against their employees argue that their position is consistent with a provision in title vii of the civil rights act of 1964 that generally permits religious organizations to prefer members of their own religion when making employment decisions . 
however , that provision does not consider whether federally-funded religious groups can discriminate with federal taxpayer dollars . 
moreover , although the supreme court upheld the constitutionality of the religious organization exemption in title vii , corporation of presiding bishop v. amos , 483 u.s. 327 , 336-39 ( 1987 ) , the court has never considered whether it is unconstitutional for a religious organization to discriminate based on religion when making employment several courts have considered whether a religious organization can retain its title vii exemption after receipt of indirect federal funds , e.g. , siegel v. truett-mcconnell college , inc. , 13 f. supp.2d 1335 , 1344 ( n.d . 
ga . 
1994 ) ( clarifying that its decision permitting a religious university to invoke the title vii exemption is because the government aid is directed to the students rather than the employer ) , but only one federal court has decided the constitutionality of retaining the title vii exemption after receipt of direct federal funds , dodge v. salvation army , 1989 wl 53857 ( s.d . 
miss . 
1989 ) . 
in that decision , the court held that the religious employer 's claim of its title vii exemption for a position `` substantially , if not exclusively '' funded with government money in addition to causing the establishment clause violation cited by the court in dodge , h.r. 2210 would also subject the government and any religious employer invoking the right to discriminate with federal dollars to liability for violation of constitutional rights under the free exercise clause and the equal protection clause . 
although mere receipt of government funds is insufficient to trigger constitutional obligations on private persons , a close nexus between the government and the private person 's activity can result in the courts treating the private person as a state actor . 
rendell-baker v. kohn , 457 u.s. 830 ( 1982 ) . 
it is beyond question that the government itself can not prefer members of a particular religion to work in a federally-funded program . 
the equal protection clause subjects governments engaging in intentional discrimination on the basis of religion to strict scrutiny . 
e.g. , united states v. batchelder , 442 u.s. 114 , 125 n.9 ( 1979 ) ; city of new orleans v. dukes , 427 u.s. 297 , 303 ( 1976 ) . 
no government could itself engage in the religious discrimination in employment accommodated and encouraged by the proposed rule 's employment provision . 
thus , the government would be in violation of the free exercise clause and the equal protection clause for knowingly funding religious discrimination . 
of course , a private organization is not subject to the requirements of the free exercise clause and the equal protection clause unless the organization is considered a state actor for a specific purpose . 
west v. atkins , 487 u.s. 42 , 52 ( 1988 ) . 
the supreme court recently explained when there is a sufficient nexus between the government and the private person to find that the private person is a state actor for purposes of compliance with constitutional requirements on certain decisions made by participants in the government program : [ s ] tate action may be found if , though only if , there is such a `close nexus between the state and the challenged action ' that seemingly private behavior `may be fairly treated as that of the state itself. ' ... .. 
we have , for example , held that a challenged activity may be state action when it results from the state 's exercise of `coercive power , ' when the state provides `significant encouragement , either overt or covert , ' or when a private actor operates as a `willful participant in joint activity with the state or its agents ' . 
... .. 
brentwood academy v. tennessee secondary school athletic association , 121 s. ct . 
924 , ( 2001 ) ( citations omitted ) . 
the extraordinary role that the current administration -- and the amendment sponsors -- have taken in accommodating , fostering , and encouraging religious organizations to discriminate based on religion when hiring for federally-funded programs creates the nexus for constitutional duties to be imposed on the provider , in addition to the requirements already placed on government itself . 
the clear intent of this amendment to repeal the civil rights provision in the head start act is to encourage certain providers receiving federal funds to discriminate based on religion . 
the proposed amendment to h.r. 2123 provision allowing federally-funded religious discrimination is part of a growing pattern of congressional , presidential , and regulatory actions taken specifically for the purpose of accommodating , fostering , and encouraging federally-funded private organizations to discriminate in ways that would unquestionably be unconstitutional if engaged in by the federal government itself . 
for example , in december of 2002 , president bush signed executive order 13279 , which amended an earlier executive order , which had provided more than 60 years of protection against discrimination based on religion by federal contractors . 
the bush order provides an exemption for religious organizations contracting with the government to discriminate in employment although religious employers have the right under title vii to apply religious tests to employees , the constitution requires that direct receipt and administration of federal funds removes that exemption . 
in addition , the federal government itself has constitutional obligations to refrain from religious discrimination or from establishing a religion . 
h.r. 2123 , if amended , would fail to meet any of those constitutional mandates . 
for these reasons , the aclu strongly urges you to vote `` no '' on any proposed amendment to the head start reauthorization ( `` school readiness act '' -- h.r. 2123 ) that would create an unconstitutional loophole allowing federally-funded religious discrimination and to vote `` no '' on final passage if an amendment is adopted . 
thank you for your attention to this matter , and please do not hesitate to call terri schroeder at 202-675-2324 if you have any questions regarding this issue . 
very truly yours , & lt ; center & gt ; caroline fredrickson , & lt ; /center & gt ; & lt ; center & gt ; director . 
& lt ; center & gt ; terri schroeder , & lt ; /center & gt ; & lt ; center & gt ; senior lobbyist . 
national league of cities , washington , dc , september 21 , 2005 . 
dear committee member : on behalf of the 18 , 000 cities represented by the national league of cities ( nlc ) , i want to commend members of the education and workforce committee on the passage of bipartisan head start legislation , h.r. 2123 , the `` school readiness act of 2005. '' head start is critical to helping to alleviate the plight of children of the working poor . 
in particular , nlc strongly endorses the committee 's commitment not to include language that would preempt state and local employment laws thereby permitting discrimination in employment by government-funded faith-based social service providers . 
as you know , local governments have a long and rich history of working with faith-based organizations that predates the enactment of the charitable choice provision contained in the welfare-to-work act of 1996 . 
nlc is especially proud of the fact that cities across the nation have carefully helped faith-based groups deliver services to our constituents while respecting the boundaries of our constitution . 
permitting government-funded employment discrimination is the wrong way to encourage faith-based institutions that deliver social services to apply for public funding . 
simply put , any language that preempts local governments from protecting its residents from employment discrimination undermines the spirit and letter of title vii of the civil rights act and unnecessarily encourages litigation against municipalities . 
nlc asks members of the house of representatives to maintain the committee 's bipartisan direction and oppose any attempts to repeal longstanding anti-discrimination protections during deliberation on the house floor . 
thank you . 
very truly yours , donald j. borut , national education association , washington , dc . 
september 21 , 2005 . 
dear representative : on behalf of the national education association 's ( nea ) 2.7 million members , we would like to offer our views on the school readiness act of 2005 ( h.r. 2123 ) , scheduled for floor debate this week . 
overall , we believe the bill contains a number of positive provisions . 
however , we do have some concerns as outlined below . 
in particular , we strongly oppose any amendment to repeal civil rights protections for head start teachers , staff , and volunteers and will oppose the final bill if it does not contain these protections . 
votes associated with these issues may be included in the nea legislative report card for the 109th congress . 
nea believes that children 's learning begins well before they enter school , and that the transition to school must be founded on strong school readiness . 
head start has a long history of success in this arena , having provided high-quality early childhood education , health , social services , and parental involvement programs to more than 18.5 million low-income children between the ages of 3 and 5 since its creation in 1964 . 
given the critical importance of head start , we are particularly pleased that h.r. 2123 does not allow for block granting of head start funds to states . 
we are also pleased that the bill would align head start curricula with k-12 education while preserving the comprehensive nature of the head start program . 
we believe these provisions will support effective transitions for children 's learning and development and ensure that children will enter school ready to learn . 
at the same time , the proposal will provide continuity for children by retaining the essential parental involvement , nutrition , and other nonacademic features of head start . 
we do have some concerns with portions of h.r. 2123 as drafted as well as proposed amendments : civil rights protections . 
we are very pleased that h.r. 2123 maintains provisions designed to protect over 198 , 000 head start teachers and staff and over 1 , 450 , 000 parent volunteers from employment discrimination based on religion in federally-funded head start programs . 
we recognize the invaluable contributions of religious organizations participating in head start . 
however , we are deeply concerned that a repeal of civil rights protections could allow religious organizations participating in head start to fire teachers or parent volunteers based on their religion . 
we strongly believe that allowing discrimination based on religion would significantly impede the important goals of head start as well as send a damaging message to students . 
we urge your opposition to any amendment , including one expected to be offered by representative boustany , that would repeal civil rights protections for head start employees . 
professional development . 
we are very pleased that h.r. 2123 has a strong focus on early childhood educator professional development . 
we are concerned , however , that the bill would require teachers to have higher academic degrees , without providing for a substantial increase in funding either for professional development or compensation . 
we recommend addressing this concern , including by providing grants to help teachers meet the costs of earning their bachelor 's and associates degrees and/or increasing the salaries of those teachers who earn degrees in early childhood education . 
assessments . 
h.r. 2123 allows a study of , and recommendations on , appropriate assessments for young children . 
we would recommend that the national academy of sciences conduct a review of the national reporting system to ensure that the assessments are comprehensive , reliable , and that the results are used to improve student achievement . 
we also hope to work with you toward increasing funding authorization levels to ensure that head start can fully serve all eligible low-income children and their families . 
we thank you for your consideration of our views on these important issues . 
& lt ; center & gt ; diane shust , & lt ; /center & gt ; & lt ; center & gt ; & lt ; em & gt ; director of government relations . 
& lt ; /em & gt ; & lt ; center & gt ; randall moody , & lt ; /center & gt ; & lt ; center & gt ; & lt ; em & gt ; manager of federal policy and politics. & lt ; /em & gt ; american humanist association , washington , dc , september 16 , 2005 . 
dear representative : the american humanist association ( aha ) stands in opposition to any retrenchment of existing civil rights protections , and therefore opposes any specific attempt to reverse the nondiscrimination provisions currently in effect in the head start program . 
congressman john boehner ( oh ) has indicated his intent to roll back vital civil rights protections by introducing , on the house floor , an amendment to h.r. 2123 , the school readiness act . 
on behalf of the oldest and largest humanist organization in the nation , i ask you to oppose any such attempt to legalize discrimination with federal funds as you vote on the bipartisan head start reauthorization bill . 
there is no compelling reason to undo the civil rights protections in the head start program that president nixon signed into law in 1972 . 
if this 33 year old nondiscrimination policy were discarded , the head start reauthorization would permit religious organizations to use federal funds to discriminate on the basis of religion , even when engaging in purely secular early childhood education activities . 
not only would such a removal of employment discrimination safeguards hold significant potential harm for humanists , jews , muslims , buddhists , and others who hold minority lifestances , it would not address an existing problem . 
faith-based organizations have been partnering with the government to provide social services for many years without the need to bypass civil rights laws . 
humanists are particularly concerned about this potential amendment because many dedicated teachers and volunteers in the head start program would find themselves disenfranchised just because they do not happen to believe as others do . 
as a result , this bill will likely lose the existing support of many religious , civil rights , education , health , and advocacy organizations if congressman boehner 's amendment is adopted . 
as humanists we persistently oppose federal funding for discrimination , especially discrimination done on the basis of religion or lack thereof . 
if religious or secular organizations wish to utilize taxpayer dollars to operate on our government 's behalf , they must also abide by the standards set for public service . 
this is why i write to ask you to oppose any amendment to the legislation that would roll back these critical civil rights protections . 
if such an amendment is added to the bill , we strongly urge you to oppose final passage of the bill . 
should you have any questions about our position , please do not hesitate to contact roy speckhardt on our staff . 
sincerely , mel lipman , religious discrimination , september 19 , 2005 . 
dear representative : we , the undersigned religious , civil rights , labor , education , health , and advocacy organizations are writing to urge you to oppose any amendment to repeal longstanding critical civil rights protections contained in the school readiness act ( h.r. 2123 ) and vote `` no '' on final passage if such an amendment is adopted . 
as unanimously passed out of the committee on education and the workforce , h.r. 2123 maintains longstanding provisions designed to protect over 198 , 000 head start teachers and staff and over 1 , 450 , 000 parent volunteers from employment discrimination based on religion in federally-funded positions in head start programs . 
the critical longstanding nondiscrimination provisions have been included in head start legislation since 1972 . 
this is a fundamental civil rights protection against employment discrimination for head start teachers and volunteers . 
the legislation always has received strong bipartisan support from both the house and senate since its enactment in the 92nd congress when president nixon signed the legislation into law . 
the 33 year old civil rights provision has worked effectively since the inception of this program , allowing religious organizations to participate in programs while maintaining constitutional and civil rights standards . 
we are pleased that the committee-passed head start legislation maintains longstanding critical civil rights protections . 
however , we are troubled by the threat of repealing these protections on the house floor . 
in a statement released by the committee on education and the workforce on may 5 , 2005 , the day h.r. 2123 was introduced , chairman boehner stated that he foresaw an amendment on the house floor to roll back longstanding critical civil rights protections . 
the civil rights protections afforded to head start teachers and staff are vital and should not be dislodged . 
we recognize that religious organizations participating in the head start program make an invaluable contribution to the education of thousands of students . 
these religious organizations have complied with head start 's existing civil rights requirements . 
however , if the repeal of the existing civil rights protections becomes law , teachers or parent volunteers working in any head start program run by a religious organization could potentially lose their jobs based only on their religion . 
students participating in head start therefore could lose not only their teachers , but also the close programmatic connection with their own parents volunteering in the program . 
we strongly believe that allowing discrimination based on religion would significantly impede the important goals of head start , send a damaging message to head start students , and harm their we urge you to maintain current law and reject any assault on civil rights protections in federally-funded programs , especially a program as critical as head start . 
if these longstanding critical civil rights protections are repealed we urge you to vote `` no '' on final passage of h.r. 2123 . 
the dismantling of civil rights will destroy the nature of a program in which the education of young children is so dependent on parent participation and on ongoing , close relationships with head start teachers . 
sincerely , african american ministers in action . 
american association of university women . 
american civil liberties union . 
american federation of state , county and municipal employees . 
american federation of teachers . 
american humanist association . 
american jewish committee . 
american jewish congress . 
american-arab anti-discrimination committee ( adc ) . 
americans for democratic action . 
americans for religious liberty . 
americans united for separation of church and state . 
baptist joint committee for religious liberty . 
central conference of american rabbis . 
children 's defense fund . 
church women united . 
communications workers of america . 
disciples justice action network ( disciples of christ ) . 
equal partners in faith . 
faith action network of people for the american way . 
gay , lesbian and straight education network . 
general board of church and society of the united methodist church . 
human rights campaign . 
international union , uaw . 
legal momentum ( formerly now legal defense ) . 
mexican american legal defense and educational fund ( maldef ) . 
national association of social workers . 
national center on domestic and sexual violence . 
national council of jewish women . 
national council of women 's organizations . 
national education association . 
national head start association . 
national mental health association . 
national organization of women . 
national pta . 
national women 's law center . 
omb watch . 
people for the american way . 
secular coalition for america . 
service employees international union . 
stop family violence . 
texas faith network . 
texas freedom network . 
the interfaith alliance/foundation . 
the secular coalition for america . 
union for reform judaism . 
unitarian universalist association of congregations . 
united church of christ justice & amp ; witness ministries . 
women of reform judaism . 
the interfaith alliance , washington , dc , september 16 , 2005 . 
dear representative : i write to you today as the president of the interfaith alliance , a nonpartisan , national grassroots organization dedicated to promoting the positive and healing role of religion in public life to oppose any amendment to repeal longstanding critical civil rights protections contained in the school readiness act ( h.r. 2123 ) and vote `` no '' on final passage if such an amendment is adopted . 
as unanimously passed out of the committee on education and the workforce , h.r. 2123 maintains longstanding provisions designed to protect over 198 , 000 head start teachers and staff and over 1 , 450 , 000 parent volunteers from employment discrimination based on religion in federally funded head start programs , as an organization whose membership is comprised of 150 , 000 people of faith and good will spanning 75 faith traditions , i can think of no reason to justify an attempt to roll back these longstanding civil rights and religious liberty protections . 
indeed , in a nation as intentionally and increasingly pluralistic as ours , built-in protections prohibiting religious discrimination in federally-funded programs represent a fundamental commitment towards a society that values the contributions and abilities of people of all faith traditions equally . 
religious organizations have had a long and proud history in their active participation in head start programs . 
for years , congregations have made substantial contributions to their communities with the existing workplace protections in place . 
if those in congress who seek to repeal these employment safeguards are successful , thousands of teachers and parent volunteers who have dedicated themselves to this program could find themselves no longer welcome at religiously-affiliated head start programs because they are of a different faith than the sponsoring organization . 
while the interfaith alliance is supportive of the right of sectarian organizations to hire based on religious preference for purposes of furthering their institutional ministry , we believe that houses of worship forfeit that right once they accept federal taxpayer dollars to implement social service programs that are intended to serve all . 
further , any attempt to politicize the head start program -- a federally sponsored preschool program conceived to meet the needs of disadvantaged children since 1965 -- through a floor amendment to add the highly controversial religious exemption language , is not only unnecessary , but a sad commentary on the state of those political leaders who seek to attach religious exemption language to every social service program that comes before the congress . 
the interfaith alliance is pleased with the bipartisan direction of the head start legislation however ; this bill will no longer be bipartisan if there is any attempt to roll back longstanding critical civil rights protections . 
the civil rights protections afforded to head start teachers and staff are vital and should not be dislodged . 
this bill has gained broad support among religious , civil rights , labor , education , health , and advocacy organizations , but that broad support will end if there is any threat to remove the longstanding critical civil rights protections in head start . 
if you need further information on our position on this matter , please do not hesitate to contact kim baldwin , director of public policy and voter education or preetmohan singh , senior policy analyst , at 202-639-6370 . 
sincerely , & lt ; center & gt ; rev . 
dr . 
c. welton gaddy , & lt ; /center & gt ; & lt ; center & gt ; & lt ; em & gt ; president , the interfaith alliance , pastor of preaching and worship , north minster baptist church ( monroe , la ) . & lt ; /em & gt ; association of congregations , washington , dc , june 1 , 2005 . 
dear member of congress : i am writing on behalf of the over 1 , 050 congregations that make up the unitarian universalist association in regard to h.r. 2123 , the school readiness act of 2005 , the legislation to reauthorize the head start program . 
the unitarian universalist association would like to express our continued support of this program , as we believe that head start is a successful and necessary program that helps prepare nearly 20 million low-income children for success in kindergarten and later life . 
we remain pleased with the general direction of the house bill as it comes out of the committee on education and the workforce . 
we are , however , concerned over proposals by committee leadership to offer a floor amendment to repeal civil rights protections in hiring in head start programs . 
the uua encourages you to pass a reauthorization bill that is truly bi-partisan in recognizing the successes of the head start program and maintaining the high quality of comprehensive services it provides without repeal of long-standing civil rights protections . 
we ask that you vote against any amendment on the floor that would repeal civil rights protections . 
if such an amendment is included in the final bill , we ask that you vote no on final passage of h.r. 2123 . 
we urge you to oppose the repeal of longstanding civil rights protections designed to protect head start teachers , staff , and parent volunteers from employment discrimination based on religion in federally funded head start programs . 
this provision has worked for 24 years , encouraging religious organizations to participate in head start and make invaluable contributions to children 's education and well-being , while maintaining constitutional and civil rights standards . 
allowing discrimination based on religion would significantly impede the important goals of head start , send a damaging message to head start students , and harm their education by separating students from their own teachers and parent volunteers . 
on behalf of the unitarian universalist association of congregations , i thank you for your consideration of our views on head start reauthorization . 
head start is an exemplary program that has a well-deserved reputation for delivering quality services to millions of our country 's children . 
this program is an excellent example of how religious organizations such as houses of worship work in partnership with the government without compromising either protections for religious minorities or the integrity of religious organizations . 
we urge the house to pass a bipartisan bill that will continue the success of head start without eliminating important civil rights provisions by voting no on any proposed amendment eliminating such provisions and voting no on final passage of a bill including such provisions . 
in faith , robert c. keithan , international union , clc , washington , dc , september 20 , 2005 . 
dear representative : on behalf of 1.8 million members of the service employees international union ( seiu ) , working in health care , building services , and federal , state , and local governments , including more than 220 , 000 early education workers throughout the united states , i write to encourage you to take a closer look at several key provisions in the head start reauthorization bill that could impact the quality of head start for children . 
as the school readiness act of 2005 ( h.r. 2123 ) moves to the house floor for a vote this week , we hope that you will use this time as an opportunity to improve the quality of head start programs that serve low-income children nationwide . 
since its inception in 1965 , the head start program has enrolled more than 22 million children . 
head start provides an array of comprehensive services to low-income parents and children that they may not otherwise have access to on their own . 
head start not only prepares children for school by providing a solid foundation in cognitive learning and socialization skills , but also helps make children `` ready to learn '' by providing comprehensive health , dental , and nutritional services critically needed by our at-risk children . 
seiu is committed to ensuring that children who participate in head start acquire the skills that prepare them for healthy , successful lives . 
this goal will not be realized unless certain steps are taken to improve the head start program . 
the head start bill passed by the house education and workforce committee contains several provisions that we support including greater set asides for migrant and seasonal workers and native americans , as well as early head start programs . 
however , seiu remains concerned about a number of provisions that may erode the quality of head start programs if not modified . 
we have outlined those concerns below . 
seiu supports continuing education for head start staff ; however , the bill 's requirement for additional training and education for head start staff may not become reality without the quality improvement funding to make the plan attainable . 
while seiu supports additional training and education for staff , we believe more funds also need to be provided for that training and education . 
head start teachers on average make $ 23 , 564 annually . 
further , there are no current incentives to retain highly qualified staff in head start programs after attaining degrees . 
additionally , head start needs sufficient resources to ensure every eligible child can participate and to increase the quality of programs . 
two out of five preschool children ( about 800 , 000 ) and 97 percent of infants and toddlers who qualify for early head start can not participate in the program simply because there are not enough resources invested in the program . 
we support full funding for head start so all eligible children have access to the head start program . 
also , the bill 's re-competition provisions need improvement . 
seiu is encouraged that the house bill does not require automatic re-competition for every grantee after the end of their grant period . 
however , the bill does require re-competition for grantees that have a `` deficiency '' during their grant period -- regardless of whether the deficiency has been resolved or not . 
in addition , the secretary has broad authority in identifying what a `` deficiency '' is , the finding of which would require programs to re-compete their grants . 
such uncertainty for all programs -- even those with stellar records of performance -- is counterproductive and would end programs ' ability to do any long-range planning . 
in the event a grantee is unsuccessful in a re-competition , seiu continues to have concerns for existing head start workers who may be displaced by re-competition . 
services and care-giving relationships for children should not be disrupted . 
moreover , seiu supports parental involvement in head start programs and encourages members of congress to re-think its plan to diminish the role of policy councils . 
policy councils offer real parental involvement regarding personnel and budgets . 
despite the advantages of parental involvement , the house bill changes governance responsibility to the board of directors , with policy councils playing only an advisory or consulting role . 
instead , congress should recognize that parents provide valuable insight into head start programs and can provide the necessary oversight of head start programs when armed with the proper training . 
seiu supports parental involvement through policy councils . 
finally , seiu vigorously opposes attempts to include language that would repeal longstanding civil rights protections that prohibit religious-based employment discrimination by head start agencies . 
the house bill currently maintains a provision designed to protect over 198 , 000 head start teachers and staff and over 1 , 450 , 000 parent volunteers from employment discrimination . 
this decades old civil rights provision has worked effectively since the inception of this program , allowing religious organizations to participate while maintaining constitutional civil and employment protections . 
the bill has gained broad support among diverse advocacy organizations , but that support will end if there is a successful effort to remove those protections in head start when the bill goes to the floor . 
seiu asks that you vote against any amendment offered that would roll back critical civil rights protections . 
if such an amendment is included in the final bill , we urge you vote no on final passage of h.r. 2123 . 
seiu remains troubled by the bill as it is currently constructed as outlined in the letter and we will endeavor to improve the legislation when the senate takes up reauthorization . 
again , should an amendment be offered that allows faith-based organizations to use religious discrimination against teachers , staff and parent volunteers working at head start programs , we urge you to vote no upon final passage of the bill . 
sincerely , anna burger , cdf action council , september 20 , 2005 . 
dear representative : as h.r. 2123 , the school readiness act of 2005 , moves towards a full vote in the house of representatives on thursday , september 22 , the children 's defense fund is pleased to support many of the provisions on which the education and workforce committee has worked so thoughtfully and diligently . 
we are especially pleased that the committee 's bipartisan bill maintains the integrity of the head start program and the quality performance standards that have helped head start successfully serve over 22 million children since the program began . 
we are extremely concerned , however , about a religious discrimination amendment that will be offered when the bill comes to the house floor . 
this unwarranted amendment would repeal the important civil rights protections that currently exist in head start that protect teachers and volunteers working in any head start program run by a religious organization . 
such an amendment would significantly hinder the goals of the head start program and the quality of care children receive . 
cdf acknowledges the continuing contribution of faith-based individuals and organizations , which have been the backbone of head start since its inception and have historically embraced serving our most vulnerable children when few others would even consider it . 
the religious discrimination provision , however , strikes at the very core of civil rights issues that so many of these individuals fought to secure . 
it is imperative that faith-based organizations be subject to the same civil rights laws that all programs who receive federal funding must abide by . 
the following are concerns raised by the amendment : teachers and staff could be hired based on their religion rather than their qualifications . 
tens of thousands of already at-risk 3- and 4-year-old children could lose their head start teachers , who often are the most important adults , other than their parents , with whom they have established meaningful relationships . 
head start has been an important source of employment for countless parents , but this provision could result in numerous parents losing their jobs , preventing families of head start children from climbing the ladder out of poverty . 
many head start volunteers are also parents . 
parent involvement has played a critical role in the success of head start . 
these volunteers could be let go as well if the provision passes . 
head start is a critical program for our country 's most vulnerable young children , providing them with valuable tools for future success in life . 
we are greatly concerned that removing civil rights protections for employees and volunteers would be detrimental to the children and families who benefit from this program . 
what message does this send to the head start children when their teachers , staff , and parents are denied opportunities in head start , simply because they do not share the federally-funded employers ' religious beliefs ? 
while substantial progress has been made creating a bipartisan bill with many positive provisions , the addition of a religious discrimination amendment would require cdf to oppose h.r. 2123 . 
thank you for your continuing commitment to improving head start and helping it reach more of the vulnerable children and families who benefit from its essential services . 
please oppose the religious discrimination amendment . 
sincerely yours , washington , dc , september 19 , 2005 . 
dear representative : on behalf of the more than 600 , 000 members of the human rights campaign , we write to express our grave concerns with certain provisions of the school readiness act ( h.r. 2123 ) that we understand may be added as the legislation moves to the floor for a vote . 
we are particularly concerned with statements made by chairman john boehner ( r-oh ) which indicate that his clear intention is to offer an amendment on the floor adding language to reverse the non-discrimination provisions currently in effect in the head start program . 
we do not believe it should be legal to discriminate with federal funds . 
we ask you to oppose any attempt to rollback these civil rights protections , which would undermine the current bipartisan nature of the bill . 
if an amendment is added on the floor which would roll back these civil rights protections , we urge you to oppose final passage of the school readiness act ( h.r. 2123 ) . 
as the nation 's largest gay , lesbian , bisexual and transgender civil rights organization , we oppose using federal funds to discriminate on any basis , including religion , which unfortunately has been used as a proxy for discrimination on the basis of sexual orientation and gender identity . 
two prominent cases illustrate this problem : bellmore v. united methodist children 's home and department of human resources of georgia and pedreira v. kentucky baptist homes for children . 
further , we are particularly concerned that any provisions that allow federally funded religious discrimination will pre-empt local and state non-discrimination laws that include sexual orientation and gender identity . 
while we do not hold a position on the overall legislation , we have serious concerns with a provision that we understand will be offered on the floor that would roll back civil rights protections that have been in place and working effectively since 1972 . 
by abandoning these non-discrimination protections , head start providers would be able to discriminate on the basis of religion in federally funded positions , even when engaging in purely secular early childhood education activities . 
faith-based organizations have been partnering successfully with the government for a number of years without the need to bypass civil rights laws in their efforts to provide social services . 
we do not object to faith-based organizations providing education-related services or other social services . 
indeed , we deeply respect the faith community 's vital contribution to care for the most vulnerable among us . 
just as it is important these vital programs continue to provide services , it also remains important that federal funds are not used to discriminate on the basis of religion or sexual orientation or gender identity . 
for these reasons , we urge you to oppose any amendment to the legislation which would rollback these critical civil rights protections and work to produce a bipartisan bill to reauthorize the head start program . 
a vote on an amendment permitting federally funded discrimination will be considered a key vote for the human rights campaign . 
should you have any questions please do not hesitate to contact angela clements on our staff at ( 202 ) 216-1520 . 
sincerely , & lt ; center & gt ; david m. smith , & lt ; /center & gt ; & lt ; center & gt ; christopher labonte , & lt ; /center & gt ; september 19 , 2005 . 
dear representative : on behalf of the 90 , 000 members and supporters of the national council of jewish women ( ncjw ) , i am writing to ask you to oppose the boehner amendment to h.r. 2123 , the school readiness act of 2005 , and to oppose final passage of the bill if this amendment is adopted . 
ncjw has been involved with head start since its inception , and we strongly support the program and h.r. 2123 as passed unanimously by the education and the workforce committee . 
efforts to amend the bill to open the door to religious discrimination would compromise the success of this program . 
ncjw believes that taxpayer funds should never be used to subsidize discrimination on any basis . 
since president nixon signed the head start program into law four decades ago , this acclaimed early childhood education program has included civil rights language protecting head start teachers from employment discrimination . 
this provision works well , allowing religious organizations to participate in head start while maintaining constitutional and civil rights standards . 
ncjw strongly supports the bipartisan effort to reauthorize head start . 
but the boehner amendment looms as a `` poison pill '' undermining this bipartisanship . 
house consideration of h.r. 2123 should focus on meeting the needs of disadvantaged children -- improving policy and providing sufficient funds to extend head start to all eligible children . 
the boehner amendment is totally unnecessary and interjects a controversial , political issue which has the potential to threaten the bill 's progress . 
the house of representatives must not roll back critical civil rights protections . 
for over a century , ncjw has been at the forefront of social change , raising its voice on important issues of public policy . 
inspired by our jewish values , ncjw has been , and continues to be , an advocate for the needs of women , children , and families and a strong supporter of equal rights and protections for everyone . 
i urge you to oppose any amendment allowing employment discrimination and to oppose the underlying bill if such an amendment is included . 
sincerely , phyllis snyder , national council of la raza , washington , dc , september 19 , 2005 . 
dear member of congress : on behalf of the national council of la raza ( nclr ) , the largest national latino civil rights and advocacy organization in the u.s. , i write on an issue of great importance to the hispanic community . 
on thursday , the house of representatives is scheduled to vote on legislation to reauthorize the head start program , the `` school readiness act of 2005 '' ( h.r. 2123 ) . 
this legislation is the result of bipartisan work of the committee on education and the workforce to address much-needed improvements to the program for latino children . 
however , nclr is concerned that this bipartisan work will be jeopardized by an amendment that would allow for employment discrimination based on religion in the program . 
nclr has long recognized that head start is a critically important program for ensuring that latino children begin their school careers ready to learn . 
for these reasons , nclr has pursued a reauthorization agenda focused on ensuring that head start continues to show progress in its effort to eliminate disparities in access and enhance the quality of services for latino and limited-english-proficient ( lep ) children and their families . 
we are pleased that members from both sides of the aisle supported this agenda and worked to include provisions in h.r. 2123 that significantly improve the program for latinos . 
these provisions include , but are not limited to , the following : additional resources for migrant and seasonal head start ( mshs ) program expansion , which will allow for thousands of farmworker children to exit the fields and enter the classroom . 
an accountability provision which ensures that head start providers serve new populations in their local communities through enhanced monitoring and evaluations of annual community assessments . 
a new requirement that the secretary conduct a study on the status of lep children and their families in head start and early head start programs . 
a new requirement that the secretary utilize training and technical assistance funds for activities aimed at assisting head start providers to conduct outreach and improve the quality of services to lep populations , particularly in states with new and rapidly growing lep populations . 
a new requirement that all head start parents receive information and services in their home language , when possible . 
a new requirement that , in addition to making progress toward acquisition of the english language , leps show progress toward the school readiness indicators outlined in the head start education performance standards . 
in addition , while nclr is pleased with the aforementioned provisions in h.r 2123 , we stand in solidarity with the broader civil rights community in our strong opposition to any amendment that could open the door to employment discrimination based on religion in the head start program . 
foremost , such an amendment is unnecessary for ensuring greater participation from the faith-based sector in the program ; faith-based providers have served as an important partner in head start since the program 's inception . 
moreover , such an amendment will only serve to deter critical attention and debate away from provisions in the legislation that have garnered strong bipartisan support , such as improvements to the program for latino children . 
we urge members of congress in closing , nclr affirms its strong support of provisions included in h.r. 2123 which increase access to and improve the quality of head start for latino children . 
we are certain that these policy changes will go a long way toward ensuring that latino children fully benefit from the program and that head start remains a model for early education into the future . 
sincerely , janet murguia , people for the american way , washington , dc , september 16 , 2005 . 
dear representative : on behalf of the more than 750 , 000 members and supporters of people for the american way , we urge you to maintain the bipartisan direction of h.r. 2123 , the `` school readiness act of 2003 , '' and oppose any attempt to repeal longstanding anti-discrimination protections . 
we commend you on your bipartisan efforts on head start reauthorization legislation . 
head start programs not only offer opportunities to thousands of low-income children , they also enrich their communities by providing job opportunities to over a third of the parents whose children have participated in the program . 
as it stands , this bill currently upholds key anti-discrimination provisions that have been part of head start since its inception . 
however , in a statement released by the committee on education and the workforce on may 5 , 2005 , chairman boehner stated that he anticipates and supports an amendment on the house floor to rollback longstanding critical civil rights protections . 
this type of amendment would be a direct attack on bipartisan , anti-discrimination provisions that have been part of head start since its creation in 1981 and can not be tolerated . 
people for the american way can not support a compromise that does not ensure that the existing civil rights protections in h.r. 2123 are not summarily removed on the house floor . 
proponents of anti-civil rights provisions claim there is a need to exempt religious organizations from anti-discrimination laws in order to protect the religious identity of that organization . 
this is simply not true . 
for decades , religious organizations have partnered with the government to provide social services . 
they have done so by separating their worship and related activities from government-funded social services , and , where necessary , creating a separate non-sectarian 501 ( c ) ( 3 ) organization to provide the services . 
under this model , religious organizations have provided an invaluable contribution to the education of thousands of head start students and to the communities in which they live . 
congress should not adopt changes that would alter this beneficial relationship , particularly when there is no evidence that religious organizations are actively seeking the religious exemption in question . 
again , we are pleased with the bipartisan direction of head start reauthorization legislation . 
however , we are concerned with any amendments which would rollback longstanding critical civil rights protections and thereby detrimentally affect head start teachers , students and their parents . 
the current , delicate balance encouraging the participation of religious organizations and compliance with our constitution should not be disrupted . 
for these reasons , we urge you to continue efforts to ensure that this legislation remains bipartisan , as well as oppose any attempts to repeal longstanding anti-discrimination provisions in h.r. 2123 . 
sincerely , & lt ; center & gt ; ralph g. neas , & lt ; /center & gt ; & lt ; center & gt ; president . 
& lt ; center & gt ; tanya m. clay , & lt ; /center & gt ; & lt ; center & gt ; deputy director of public policy . 
union for reform judaism , september 19 , 2005 . 
dear representatives : on behalf of the union for reform judaism , whose 900 congregations across north america encompass 1.5 million reform jews , and the central conference of american rabbis ( ccar ) , whose membership includes over 1800 reform rabbis , i strongly urge you to maintain the bipartisan character of the school readiness act of 2005 ( h.r. 2123 ) by opposing any attempt to repeal longstanding civil rights protections that prohibit faith-based head start centers from discriminating in whom they hire on the basis of religion . 
should such language be added to the bill , i urge you to vote against final passage . 
we expect government-funded programs to hire the people who are most qualified , not those whose religious beliefs best match those of an employer . 
this is especially problematic in relation to head start . 
one 's faith does not determine how one reads a book to preschoolers or sings the `` alphabet song. '' to deny children living in poverty the most qualified teacher is nothing short of an attack on head start 's core mission -- preparing children to succeed in school . 
since its founding , head start has prided itself on the strength of its family involvement component . 
head start has successfully trained many of its low-income parents to work at head start centers , helping parents rise out of poverty . 
in fact , the family and child experiences survey , prepared in january 2002 for the u.s. department of health and human services , found that over 40 percent of head start staff members had children in their households who were current or former head start participants . 
on the day this bill becomes law , faith-based head start programs could fire such staff members because of their religious beliefs . 
a head start center could refuse to consider a qualified parent for a job because of the way the parent chooses to worship . 
experience teaches us that a broad exemption for religious organizations would permit religious groups to use government money to discriminate based on race , sexual orientation , and marital status . 
we are pleased with the bi-partisan efforts to improve upon previous head start reauthorization attempts . 
however , on the day that h.r. 2123 was introduced , representative john boehner ( r-oh ) stated his intention to offer an amendment to roll-back the current civil rights protections within the head start program when the bill is considered by the full house . 
to plainly state such intentions diminishes the much-heralded bipartisan spirit of the bill and undermines the gains made thus far in the mark-up process . 
our tradition includes a story of a teacher whose prayer for rain was answered promptly . 
asked to tell of his special merit , he replied : `` i teach children of the poor as well as of the rich ; i accept no fee from any who can not afford it ; and i have a fishpond to delight the children and to encourage them to do their lessons. '' since 1965 , through its comprehensive services and high quality standards , head start has striven to give millions of children an equal opportunity to succeed in school , nurturing their love of learning and delight in life . 
i urge you to protect such opportunity for our nation 's teachers , parents , and children by opposing any attempt to repeal the civil rights protections in h.r. 2123 . 
respectfully , dear representative : we , the undersigned religious and religiously affiliated organizations , write to urge you to oppose the planned boehner religious discrimination amendment to the school readiness act ( h.r. 2123 ) , the bill reauthorizing the head start program . 
the bill approved 48-0 by the house committee on education and the workforce that reaches the house floor is the product of many months of hard work resulting in a strong bipartisan agreement . 
it maintains critical civil rights protections in head start , preventing religious discrimination in federally funded head start positions . 
any attempts to amend the bill and repeal these protections threaten not only the bipartisan spirit of the bill , but the integrity of the head start program itself . 
if the promised boehner amendment passes , we urge you to vote `` no '' to h.r. 2123 . 
we are disappointed that an otherwise acceptable bill could be jeopardized with such an unwise amendment . 
we represent a diverse array of religions , covering the political and ideological spectrum . 
we stand united to oppose this unwarranted attack on a vital civil rights provision that protects over 1.6 million teachers and parent volunteers from having to choose between their religion and their participation in the local head start program . 
the bipartisan bill that passed unanimously out of the committee on education and the workforce has the potential to garner support from a broad range of groups , including all of the religious groups on this letter , but not if the proposed language is included . 
as religious institutions , we support preserving the autonomy of religious organizations with respect to hiring decisions made in privately funded programs . 
however , we also recognize the importance of ensuring that taxpayer dollars do not fund positions connected with the operation of the program itself where candidates may be disqualified because of the religion they practice . 
the longstanding nondiscrimination provision included in head start legislation since 1972 strikes the appropriate balance between religious autonomy and nondiscrimination . 
for over three decades , as religious and religiously affiliated organizations , we strive to make the world a better place for the next generation and generations to follow . 
the head start program is an extremely successful government funded means of achieving this goal , providing opportunities for nearly one million at-risk children each year . 
we urge you to oppose any effort , such as rep . 
boehner 's planned floor amendment , to change this crucial program by stripping its civil rights protections and allowing providers to discriminate on religious grounds . 
thank you for your consideration of this important matter . 
respectfully , omb watch , washington , dc , september 16 , 2005 . 
dear representative : omb watch strongly urges you to oppose the any attempt to include `` charitable choice '' provisions in the head start program , which would allow religious organizations to discriminate on the basis of the religion when hiring for federally funded programs . 
religious organizations play a meaningful role in the delivery of social service programs . 
we do not question the right of religious organizations to participate in federal programs , nor their ability to avail themselves of an exemption under title vii of the civil rights act of 1964 that allows religious organizations to hire co-religionists with their own money . 
however , we do question whether federal dollars should fund discrimination by the very few religious organizations that refuse to follow the same rules that all other organizations participating in federal programs follow . 
although religious employers have the right under title vii to apply religious tests to employees , the constitution requires that the direct receipt and administration of federal funds remove that exemption . 
in addition , the federal government has constitutional obligations reinforced by chief justice rehnquist 's majority opinion in bowen v. kendrick , 487 u.s. 589 ( 1988 ) . 
the court stated that although the constitution does not bar religious organizations from participating in federal programs , it requires ( 1 ) that no one participating in a federal program can `` discriminate on the basis of religion '' and ( 2 ) that all federal programs must be carried out in a `` lawful , secular manner. '' id . 
at 609 , 612 . 
faith-based and secular grantees face high standards and must be treated equally . 
the acceptance of federal funds -- taxpayer money -- should require all recipients to practice non-discrimination in hiring as it relates to those funds . 
i urge you to maintain the integrity of religious grantees and prevent government-funded religious discrimination by opposing any attempt to include `` charitable choice '' provisions into the head start program . 
if you have any questions , please contact jennifer lowe at 202-234-8494 . 
thank you for your attention to this matter . 
sincerely , gary bass , 

Now search for something else! Another two terms that might show up. elections and chaos? Whatever you thnik might be interesting.


In [278]:
All_tokens['chaos'] = All_tokens['chaos'].sort_values(ascending=False) >= 1
All_tokens[All_tokens['chaos'] == True].count().head(1)


Out[278]:
000    3
dtype: int64

Enough of this garbage, let's cluster

Using a simple counting vectorizer, cluster the documents into eight categories, telling me what the top terms are per category.

Using a term frequency vectorizer, cluster the documents into eight categories, telling me what the top terms are per category.

Using a term frequency inverse document frequency vectorizer, cluster the documents into eight categories, telling me what the top terms are per category.


In [ ]:
#simple counting vectorizer,

In [291]:
from sklearn.cluster import KMeans
number_of_clusters = 8
km = KMeans(n_clusters=number_of_clusters)

In [292]:
count_vectorizer = CountVectorizer(stop_words='english')
X = count_vectorizer.fit_transform(All_speeches)
km.fit(X)


Out[292]:
KMeans(copy_x=True, init='k-means++', max_iter=300, n_clusters=8, n_init=10,
    n_jobs=1, precompute_distances='auto', random_state=None, tol=0.0001,
    verbose=0)

In [293]:
print("Top terms per cluster:")
order_centroids = km.cluster_centers_.argsort()[:, ::-1]
terms = count_vectorizer.get_feature_names()
for i in range(number_of_clusters):
    top_ten_words = [terms[ind] for ind in order_centroids[i, :5]]
    print("Cluster {}: {}".format(i, ' '.join(top_ten_words)))


Top terms per cluster:
Cluster 0: start head children program amendment
Cluster 1: head start religious rights civil
Cluster 2: nbsp amp lt gt trade
Cluster 3: mr chairman time gentleman amendment
Cluster 4: association national restaurant contractors chamber
Cluster 5: church financial embezzlement says churches
Cluster 6: rule 11 rules federal 420
Cluster 7: house mr elections time states

In [ ]:
# term frequency vectorizer,

In [296]:
vectorizer = TfidfVectorizer(use_idf=True, stop_words='english')
X = vectorizer.fit_transform(All_speeches)

In [297]:
number_of_clusters = 8
km = KMeans(n_clusters=number_of_clusters)
km.fit(X)


Out[297]:
KMeans(copy_x=True, init='k-means++', max_iter=300, n_clusters=8, n_init=10,
    n_jobs=1, precompute_distances='auto', random_state=None, tol=0.0001,
    verbose=0)

In [360]:
print("Top terms per cluster:")
order_centroids = km.cluster_centers_.argsort()[:, ::-1]
terms = count_vectorizer.get_feature_names()
for i in range(number_of_clusters):
    top_ten_words = [terms[ind] for ind in order_centroids[i, :10]]
    print("Cluster {}: {}".format(i, ' '.join(top_ten_words)))


Top terms per cluster:
Cluster 0: harry said just like hermione eyes know time didn don
Cluster 1: harry said hermione just like know time looked asked ron

In [302]:
#term frequency inverse document frequency vectorizer

In [357]:
def oh_tokenizer(str_input):
    words = re.sub(r"[^A-Za-z0-9\-]", " ", str_input).lower().split()
    return words

l2_vectorizer = TfidfVectorizer(use_idf=True, stop_words='english', tokenizer=oh_tokenizer) 
X = l2_vectorizer.fit_transform(speeches_df['content'])
l2_df = pd.DataFrame(X.toarray(), columns=l2_vectorizer.get_feature_names())
for i in range(number_of_clusters):
    top_ten_words = [l2_df[ind] for ind in order_centroids[i, :9]]
    print("Cluster {}: {}".format(i, ' '.join(top_ten_words)))


---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
/usr/local/lib/python3.5/site-packages/pandas/indexes/base.py in get_loc(self, key, method, tolerance)
   1944             try:
-> 1945                 return self._engine.get_loc(key)
   1946             except KeyError:

pandas/index.pyx in pandas.index.IndexEngine.get_loc (pandas/index.c:4154)()

pandas/index.pyx in pandas.index.IndexEngine.get_loc (pandas/index.c:4018)()

pandas/hashtable.pyx in pandas.hashtable.PyObjectHashTable.get_item (pandas/hashtable.c:12368)()

pandas/hashtable.pyx in pandas.hashtable.PyObjectHashTable.get_item (pandas/hashtable.c:12322)()

KeyError: 18966

During handling of the above exception, another exception occurred:

KeyError                                  Traceback (most recent call last)
<ipython-input-357-a1c652d656f2> in <module>()
      7 l2_df = pd.DataFrame(X.toarray(), columns=l2_vectorizer.get_feature_names())
      8 for i in range(number_of_clusters):
----> 9     top_ten_words = [l2_df[ind] for ind in order_centroids[i, :9]]
     10     print("Cluster {}: {}".format(i, ' '.join(top_ten_words)))

<ipython-input-357-a1c652d656f2> in <listcomp>(.0)
      7 l2_df = pd.DataFrame(X.toarray(), columns=l2_vectorizer.get_feature_names())
      8 for i in range(number_of_clusters):
----> 9     top_ten_words = [l2_df[ind] for ind in order_centroids[i, :9]]
     10     print("Cluster {}: {}".format(i, ' '.join(top_ten_words)))

/usr/local/lib/python3.5/site-packages/pandas/core/frame.py in __getitem__(self, key)
   1995             return self._getitem_multilevel(key)
   1996         else:
-> 1997             return self._getitem_column(key)
   1998 
   1999     def _getitem_column(self, key):

/usr/local/lib/python3.5/site-packages/pandas/core/frame.py in _getitem_column(self, key)
   2002         # get column
   2003         if self.columns.is_unique:
-> 2004             return self._get_item_cache(key)
   2005 
   2006         # duplicate columns & possible reduce dimensionality

/usr/local/lib/python3.5/site-packages/pandas/core/generic.py in _get_item_cache(self, item)
   1348         res = cache.get(item)
   1349         if res is None:
-> 1350             values = self._data.get(item)
   1351             res = self._box_item_values(item, values)
   1352             cache[item] = res

/usr/local/lib/python3.5/site-packages/pandas/core/internals.py in get(self, item, fastpath)
   3288 
   3289             if not isnull(item):
-> 3290                 loc = self.items.get_loc(item)
   3291             else:
   3292                 indexer = np.arange(len(self.items))[isnull(self.items)]

/usr/local/lib/python3.5/site-packages/pandas/indexes/base.py in get_loc(self, key, method, tolerance)
   1945                 return self._engine.get_loc(key)
   1946             except KeyError:
-> 1947                 return self._engine.get_loc(self._maybe_cast_indexer(key))
   1948 
   1949         indexer = self.get_indexer([key], method=method, tolerance=tolerance)

pandas/index.pyx in pandas.index.IndexEngine.get_loc (pandas/index.c:4154)()

pandas/index.pyx in pandas.index.IndexEngine.get_loc (pandas/index.c:4018)()

pandas/hashtable.pyx in pandas.hashtable.PyObjectHashTable.get_item (pandas/hashtable.c:12368)()

pandas/hashtable.pyx in pandas.hashtable.PyObjectHashTable.get_item (pandas/hashtable.c:12322)()

KeyError: 18966

Which one do you think works the best?

Not sure. The last one term frequency inverse I can't get to work. So I am going with number 2.

Harry Potter time

I have a scraped collection of Harry Potter fanfiction at https://github.com/ledeprogram/courses/raw/master/algorithms/data/hp.zip.

I want you to read them in, vectorize them and cluster them. Use this process to find out the two types of Harry Potter fanfiction. What is your hypothesis?


In [309]:
!curl -O https://github.com/ledeprogram/courses/raw/master/algorithms/data/hp.zip


  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   149  100   149    0     0    131      0  0:00:01  0:00:01 --:--:--   131

In [314]:
!unzip hp.zip


unzip:  cannot find or open hp.zip, hp.zip.zip or hp.zip.ZIP.

In [326]:
import glob

In [333]:
paths = glob.glob('hp/*.txt')

In [334]:
paths[:5]


Out[334]:
['hp/10001898.txt',
 'hp/10004131.txt',
 'hp/10004927.txt',
 'hp/10007980.txt',
 'hp/10010343.txt']

In [331]:
len(paths)


Out[331]:
1874

In [335]:
Harry_Potter_fiction = []
for path in paths:
    with open(path) as Harry_file:
        speech = {
            'pathname': path,
            'filename': path.split('/')[-1],
            'content': Harry_file.read()
        }
    Harry_Potter_fiction.append(speech)
Harry_df = pd.DataFrame(Harry_Potter_fiction)
Harry_df.head()


Out[335]:
content filename pathname
0 Prologue: The MissionDisclaimer: All character... 10001898.txt hp/10001898.txt
1 BlackDisclaimer: I do not own Harry PotterAuth... 10004131.txt hp/10004131.txt
2 Chapter 1"I'm pregnant.""""Mum please say some... 10004927.txt hp/10004927.txt
3 Author's Note: Hey, just so you know, this is ... 10007980.txt hp/10007980.txt
4 Disclaimer: I do not own Harry Potter and frie... 10010343.txt hp/10010343.txt

In [337]:
All_of_Harry = Harry_df['content']

In [339]:
All_of_Harry.head()


Out[339]:
0    Prologue: The MissionDisclaimer: All character...
1    BlackDisclaimer: I do not own Harry PotterAuth...
2    Chapter 1"I'm pregnant.""""Mum please say some...
3    Author's Note: Hey, just so you know, this is ...
4    Disclaimer: I do not own Harry Potter and frie...
Name: content, dtype: object

Term Frequency Vectorizer


In [363]:
vectorizer = TfidfVectorizer(use_idf=True, stop_words='english')
X = vectorizer.fit_transform(All_of_Harry)

In [364]:
# KMeans clustering is a method of clustering.
from sklearn.cluster import KMeans

number_of_clusters = 2
km = KMeans(n_clusters=number_of_clusters)
km.fit(X)


Out[364]:
KMeans(copy_x=True, init='k-means++', max_iter=300, n_clusters=2, n_init=10,
    n_jobs=1, precompute_distances='auto', random_state=None, tol=0.0001,
    verbose=0)

In [365]:
print("Top terms per cluster:")
order_centroids = km.cluster_centers_.argsort()[:, ::-1]
terms = vectorizer.get_feature_names()
for i in range(number_of_clusters):
    top_ten_words = [terms[ind] for ind in order_centroids[i, :10]]
    print("Cluster {}: {}".format(i, ' '.join(top_ten_words)))


Top terms per cluster:
Cluster 0: lily james sirius remus said harry just eyes potter peter
Cluster 1: harry hermione draco said just ron like ginny know eyes

In [ ]:
#Cluster 1 is about Lily and James, whoever they are. Wait: His parents.
#Cluster 2 is about Harry and Hermione.

Simple Counting Vectorizer


In [366]:
from sklearn.cluster import KMeans
number_of_clusters = 2
km = KMeans(n_clusters=number_of_clusters)

In [367]:
count_vectorizer = CountVectorizer(stop_words='english')
X = count_vectorizer.fit_transform(All_of_Harry)
km.fit(X)


Out[367]:
KMeans(copy_x=True, init='k-means++', max_iter=300, n_clusters=2, n_init=10,
    n_jobs=1, precompute_distances='auto', random_state=None, tol=0.0001,
    verbose=0)

In [368]:
print("Top terms per cluster:")
order_centroids = km.cluster_centers_.argsort()[:, ::-1]
terms = count_vectorizer.get_feature_names()
for i in range(number_of_clusters):
    top_ten_words = [terms[ind] for ind in order_centroids[i, :10]]
    print("Cluster {}: {}".format(i, ' '.join(top_ten_words)))


Top terms per cluster:
Cluster 0: harry said just like hermione eyes know time didn don
Cluster 1: harry said hermione just like know time looked asked ron

In [ ]:


In [ ]:


In [ ]: